Channel: YMC » Hadoop
Browsing latest articles
Browse All 10 View Live

Hadoop training by Cloudera

Last week I attended a Hadoop admin training course held by Cloudera in a comfortable, well-prepared location in London. This 3-day course covers several topics of the Hadoop ecosystem, all within...

View Article



Case Study: Retail WiFi Log-file Analysis with Hadoop and Impala, Part 1

This week we were inspired to do some research, driven by an idea: It must be possible to bring the concepts of tracking users in the online world to retail stores. We are not the experts in retail but...

View Article

Case Study: Retail WiFi Log-file Analysis with Hadoop and Impala, Part 2

Following on from Jean-Pierre’s introduction to this experiment in part 1, I will now expand on the technical details of the data ingestion process using Flume. As you can see in figure 2 from the...

View Article

Case Study: Retail WiFi Log-file Analysis with Hadoop and Impala, Part 3

In the previous article we described how to collect WiFi router logs with Flume and store them in HDFS. This article will describe how we did the transformation, parsing, filtering and finally loading into...

View Article

Case Study: Retail WiFi Log-file Analysis with Hadoop and Impala, Part 4

In the previous article we explained how to parse, transform and finally load data into Hive’s warehouse. Now it’s time to talk about querying the data. Before we start, here is how a sample of the...

View Article


Lambda Architecture, Part 1

We are witnessing a paradigm shift from batch-based data processing to real-time data processing using the Hadoop framework. Despite this progress, it is still a challenge to process web-scale data in...

View Article

Hello Europe! Hadoop has landed.

Last week we were in Amsterdam at the Hadoop Summit 2013. This was the first Hadoop Summit in Europe, so things are picking up momentum over here too. #HadoopSummit great that #hadoop has landed on the...

View Article

YMC & Big Data Analytics: first authorized Cloudera training partner in...

Having successfully completed Authorized Cloudera Training Partner certification, YMC is now an official Cloudera University training delivery partner, offering courses on Hadoop-based Big Data in...

View Article


Geo-based tweet analysis to alert emergency services

Introduction: The constantly rising number of tweets on Twitter contains a massive amount of data – perfectly suited for analysis using algorithms and techniques of the Big Data and Machine Learning...

View Article


How to install Cloudera Manager and Cloudera Search with support from Ansible

Cloudera Manager is a great tool to orchestrate your CDH-based Hadoop cluster. You can use it for everything from cluster installation, deploying configurations and restarting daemons to monitoring each cluster...

View Article