Hadoop training by Cloudera
Last week I attended an administrator training course on Hadoop, held by Cloudera in a comfortable and well-prepared location in London. This 3-day course covers several topics of the Hadoop ecosystem, all within...
Case Study: Retail WiFi Log-file Analysis with Hadoop and Impala, Part 1
This week we were inspired to do some research, driven by an idea: it must be possible to bring the concepts of tracking users in the online world to retail stores. We are not experts in retail but...
Case Study: Retail WiFi Log-file Analysis with Hadoop and Impala, Part 2
Following on from Jean-Pierre’s introduction to this experiment in part 1, I will now expand on the technical details of the data ingestion process using Flume. As you can see in figure 2 from the...
Case Study: Retail WiFi Log-file Analysis with Hadoop and Impala, Part 3
In the previous article we described how to collect WiFi router logs with Flume to store in HDFS. This article will describe how we did the transformation, parsing, filtering and finally loading into...
Case Study: Retail WiFi Log-file Analysis with Hadoop and Impala, Part 4
In the previous article we explained how to parse, transform and finally load data into Hive’s warehouse. Now it’s time to talk about querying the data. Before we start, here is how a sample of the...
Lambda Architecture, Part 1
We are witnessing a paradigm shift from batch-based data processing to real-time data processing using the Hadoop framework. Despite this progress, it is still a challenge to process web-scale data in...
Hello Europe! Hadoop has landed.
Last week we were in Amsterdam at the Hadoop Summit 2013. This was the first Hadoop Summit in Europe, so things are picking up momentum over here too. #HadoopSummit great that #hadoop has landed on the...
YMC & Big Data Analytics: first authorized Cloudera training partner in...
Having successfully completed Authorized Cloudera Training Partner certification, YMC is now an official Cloudera University training delivery partner, offering courses on Hadoop-based Big Data in...
Geo-based tweet analysis to alert emergency services
Introduction: The constantly rising number of tweets on Twitter yields a massive amount of data – perfectly suited for analysis using algorithms and techniques of the Big Data and Machine Learning...
How to install Cloudera Manager and Cloudera Search with support from Ansible
Cloudera Manager is a great tool to orchestrate your CDH-based Hadoop cluster. You can use it for everything from cluster installation, deploying configurations and restarting daemons to monitoring each cluster...