Thursday, August 21, 2014


So far we have covered the following topics in Big Data:

·         Big Data: The Rise and the Future

·         Big Data: Technology Stack

·         Big Data: Hadoop Distributed Filesystem (HDFS)

·         Big Data: Map Reduce

·         Big Data: Installing Hadoop (Single Node)

·         Big Data: Apache Hadoop Multi Node

·         Big Data: Troubleshooting, Administering and optimizing Hadoop

·         Big Data: Managing HDFS

·         Big Data: Map Reduce Development

·         Big Data: Introduction to Pig

In this post we will discuss Amazon Elastic MapReduce (EMR).


What is EMR?
- A web service on top of AWS that uses EC2 for processing and S3 for storage
- Data is pulled from S3, processed by an automatically configured EC2 cluster, and the results are pushed back to S3
- Crunch your data in the cloud without the hassle of managing your own cluster or infrastructure
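
To make the S3 in, EC2 processing, S3 out flow concrete, here is a minimal sketch of how a single processing step could be described in Python. The dictionary follows the step structure accepted by the EMR API (`Name`, `ActionOnFailure`, `HadoopJarStep`). The bucket name, the mapper script, and the helper `streaming_step` are hypothetical, and the use of `command-runner.jar` with `hadoop-streaming` assumes a recent EMR release rather than the 2014-era setup.

```python
def streaming_step(input_uri, output_uri, mapper_uri):
    """Describe one Hadoop streaming step that reads from S3 and writes to S3.

    Hypothetical helper: it only builds a dictionary and does not call AWS.
    """
    # The mapper is shipped to the cluster by its S3 URI and run by file name.
    mapper_file = mapper_uri.rsplit("/", 1)[-1]
    return {
        "Name": "wordcount-streaming",
        "ActionOnFailure": "TERMINATE_CLUSTER",
        "HadoopJarStep": {
            "Jar": "command-runner.jar",  # assumption: EMR 4.x or later
            "Args": [
                "hadoop-streaming",
                "-files", mapper_uri,
                "-mapper", mapper_file,
                "-reducer", "aggregate",
                "-input", input_uri,    # data is pulled from S3
                "-output", output_uri,  # results are pushed back to S3
            ],
        },
    }


# Example values only; replace with your own bucket and script.
step = streaming_step(
    "s3://example-bucket/input/",
    "s3://example-bucket/output/",
    "s3://example-bucket/code/wordcount_mapper.py",
)
```

The point of the sketch is that both the input and the output live in S3, while the EC2 cluster only exists to run the step.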

What is an EMR Job Flow?
- A data processing job that is set up through a wizard
- Supports Hive, MapReduce, HBase and Pig

The only thing we need to do is configure an EMR Job Flow. Once it is configured, the rest is easy, and configuring the Job Flow itself is straightforward in Amazon.
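
Besides the wizard, a Job Flow can also be configured from code. The sketch below builds the request for the boto3 EMR client's `run_job_flow` call. The release label, instance types, instance count, and bucket are assumptions chosen for illustration, the role names are the usual EMR defaults, and `job_flow_request` and `launch` are hypothetical helpers. Only `launch` talks to AWS, and it needs boto3 installed and credentials configured.

```python
def job_flow_request(name, log_uri, applications=("Hive", "Pig")):
    """Build keyword arguments for the boto3 EMR client's run_job_flow call."""
    return {
        "Name": name,
        "LogUri": log_uri,             # cluster logs are written to S3
        "ReleaseLabel": "emr-5.36.0",  # assumption: any current release works
        "Applications": [{"Name": app} for app in applications],
        "Instances": {
            "MasterInstanceType": "m5.xlarge",  # assumed instance type
            "SlaveInstanceType": "m5.xlarge",   # assumed instance type
            "InstanceCount": 3,
            # Shut the cluster down once there is nothing left to run.
            "KeepJobFlowAliveWhenNoSteps": False,
        },
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
        # Processing steps, if any, would go under a "Steps" key.
    }


def launch(request):
    """Submit the request to EMR and return the new job flow (cluster) ID."""
    import boto3  # imported here so the request can be built without boto3

    emr = boto3.client("emr")
    return emr.run_job_flow(**request)["JobFlowId"]


# Example values only; nothing is sent to AWS until launch(request) is called.
request = job_flow_request("demo-job-flow", "s3://example-bucket/logs/")
```

Keeping the request builder separate from the call that launches the cluster makes it possible to review the configuration before any EC2 instances are started.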

That's it.
