Tuesday, August 5, 2014

Big Data: Introduction to Pig

So far we have covered the following topics in big data. You can click a hyperlink to go to a specific topic.

-MapReduce development

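In this post, we cover Pig.
So what is Pig? Pig is a high-level data-flow scripting language for Hadoop. Its scripts are translated into MapReduce jobs, so you do not have to write those jobs by hand.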

The major components of Pig are:
-The runtime engine
-The language, Pig Latin

Execution Tools and Modes
-Grunt shell or command line
-Local or MapReduce mode
-Interactive or batch
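
As a rough illustration of these options, the sketch below shows a tiny script with the usual ways of launching it noted in comments. The file names input.txt and preview.pig are hypothetical, chosen only for this example.

```pig
-- Batch, local mode (runs on one machine against the local file system):
--   pig -x local preview.pig
-- Batch, MapReduce mode (runs on the Hadoop cluster against HDFS):
--   pig -x mapreduce preview.pig
-- Interactive: start the Grunt shell with "pig -x local" or "pig -x mapreduce"
--   and type the statements below one at a time at the grunt> prompt.

-- Hypothetical input file; TextLoader reads each line as a single chararray.
lines = LOAD 'input.txt' USING TextLoader() AS (line:chararray);
few   = LIMIT lines 5;   -- keep only five lines for a quick look
DUMP few;                -- print the result to the console
```

Local mode is handy for trying a script on a small sample before running it on the cluster.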

Below is the Pig Latin Script


Let's look at the script a little closer. The first three lines, A, B and C, load data on the cluster. D and E are filters.
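
A script of that shape can be sketched as follows. This is not the original script, only a minimal stand-in: the file paths, field names and filter conditions are all assumptions made for illustration.

```pig
-- A, B and C load data. Paths and schemas here are hypothetical.
A = LOAD 'input/users.txt'    USING PigStorage(',') AS (user_id:int, name:chararray, age:int);
B = LOAD 'input/orders.txt'   USING PigStorage(',') AS (order_id:int, user_id:int, amount:double);
C = LOAD 'input/products.txt' USING PigStorage(',') AS (product_id:int, title:chararray, price:double);

-- D and E are filters. The conditions are made up for this sketch.
D = FILTER A BY age >= 18;
E = FILTER B BY amount > 100.0;

-- Print the filtered relations to the console.
DUMP D;
DUMP E;
```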


Pig vs. SQL

SQL (Declarative)
-Queries are written from the inside out
-Complex queries can get difficult to comprehend

Pig Latin (Procedural)
-Data flows step by step
-Clean and easy to write
-Data can be stored at any point in the pipeline
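
To make the contrast concrete, here is a small sketch, assuming a hypothetical comma-separated users file with name, age and city fields. A roughly equivalent nested SQL query is shown in the comments for comparison only.

```pig
-- Roughly equivalent SQL, which has to be read from the inner query outward:
--   SELECT city, COUNT(*)
--   FROM (SELECT * FROM users WHERE age >= 18) t
--   GROUP BY city;

-- The same flow in Pig Latin, one step per statement (paths are hypothetical).
users   = LOAD 'input/users.txt' USING PigStorage(',')
          AS (name:chararray, age:int, city:chararray);
adults  = FILTER users BY age >= 18;

-- An intermediate result can be written out in the middle of the pipeline.
STORE adults INTO 'output/adults' USING PigStorage(',');

by_city = GROUP adults BY city;
counts  = FOREACH by_city GENERATE group AS city, COUNT(adults) AS n;
STORE counts INTO 'output/adults_per_city' USING PigStorage(',');
```

Each alias names the output of one step, so the script reads top to bottom in the order the data is processed.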


Pig Latin

-A Pig Latin script is a collection of statements
-Statements are built using operators and expressions
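
A short sketch of how statements combine operators and expressions, assuming a hypothetical input file with one integer per line:

```pig
-- Each statement ends with a semicolon and usually assigns its result to an alias.
nums    = LOAD 'input/numbers.txt' AS (n:int);            -- relational operator: LOAD

-- FILTER takes a boolean expression built from %, ==, > and AND.
evens   = FILTER nums BY (n % 2 == 0) AND (n > 0);

-- FOREACH ... GENERATE evaluates an arithmetic expression for every row.
squared = FOREACH evens GENERATE n, n * n AS n_squared;

DUMP squared;                                             -- print the result
```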
