
Big Data: Hadoop Distributed Filesystem (HDFS)

In this blog we will look at HDFS in detail. HDFS is not a typical file system. We will start with a high-level view of HDFS and then go deeper.

HDFS Architecture

On the left-hand side we have a single-rack cluster, and on the right-hand side a multi-rack cluster.
The first thing to note about the HDFS architecture is that it is a master-slave architecture. We have these server roles:
- Name Node
- Secondary Name Node
- Job Tracker
- Data Nodes
- Task Trackers

Let's look at the left-side diagram.

Name Node

Think of the Name Node as the HDFS daemon, the controller for all the Data Nodes. Any request that comes to the file system, whether to create a directory, create a file, or read or write a file, goes through the Name Node. The Name Node essentially manages the file system namespace: it holds in memory a snapshot of what the file system really looks like.

It also handles block mappings. Whenever you put a file into HDFS, HDFS breaks the file into blocks and spreads them across the Data Nodes, so the Name Node knows where all of those blocks are within the cluster.
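To make this concrete, here is a minimal sketch of a client creating a file through the Java FileSystem API. The create request is serviced by the Name Node, which updates the namespace and decides which Data Nodes will hold the blocks; the path and Name Node address below are hypothetical.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsWriteExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Hypothetical Name Node address; in practice this usually comes
        // from core-site.xml (the fs.default.name property in Hadoop 1.x).
        conf.set("fs.default.name", "hdfs://namenode:8020");
        FileSystem fs = FileSystem.get(conf);

        // The create() call goes to the Name Node, which updates the
        // namespace and picks the Data Nodes that will store the blocks.
        Path file = new Path("/user/demo/hello.txt");
        FSDataOutputStream out = fs.create(file);
        out.writeUTF("hello hdfs");   // block data then flows to Data Nodes
        out.close();
        fs.close();
    }
}
```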

Data Node

Data Nodes are the workhorses of the system. They are the ones actually performing all the block operations. They receive instructions from the Name Node on where and how to store the blocks.

Data Nodes are also responsible for replication. They receive instructions from the Name Node, but the Data Nodes are the ones that physically perform the replication.

Secondary Name Node
The Secondary Name Node is not really a secondary name node, despite what the name implies. Many people in the community are renaming it the Checkpoint Node. It is there to take a snapshot of the Name Node every now and then. It is like a backup, a system restore point, for the Name Node.

Now let's turn our attention to the right-side multi-rack cluster.



You can see we have broken the roles out onto their own separate servers, each in its own rack. So you can have a single-rack cluster or a multi-rack cluster. With a multi-rack cluster, new challenges appear.

The first challenge is data loss. Remember from our last blog: when we set up HDFS, we set a replication factor, and the default is 3. That means when we put a file into Hadoop, it makes three copies of every block, spread across the nodes in the cluster. So if we lose a node, we still have all the data; HDFS re-replicates the data that was on that node to the other nodes. This is the fault tolerance in Hadoop.
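As a quick sketch, the replication factor can be set cluster-wide with the dfs.replication property, or changed per file through the Java API; the file path here is made up for illustration.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReplicationExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Cluster-wide default for new files; HDFS ships with 3.
        conf.setInt("dfs.replication", 3);
        FileSystem fs = FileSystem.get(conf);

        // Bump an existing (hypothetical) file to 5 copies per block;
        // the Name Node then schedules the extra replication work.
        fs.setReplication(new Path("/user/demo/hello.txt"), (short) 5);
        fs.close();
    }
}
```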
So what happens if we lose an entire rack? Data loss. This is where the Rack Awareness feature helps us. It works by understanding our network topology, and it is a manual thing: we need to describe our network topology to the Name Node. Once the Name Node understands the topology and is rack aware, any time data comes in, it makes sure that copies of each block land on multiple Data Nodes and on more than one rack.
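One way to describe the topology is to point the Name Node at a topology script, or at a Java class implementing the DNSToSwitchMapping interface. Below is a minimal sketch of the latter, assuming the single-method Hadoop 1.x form of the interface (later releases added reload methods); the host-to-rack rules are invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.net.DNSToSwitchMapping;

// Registered via the topology.node.switch.mapping.impl property
// (Hadoop 1.x name) so the Name Node can place replicas across racks.
public class SimpleRackMapping implements DNSToSwitchMapping {
    @Override
    public List<String> resolve(List<String> names) {
        List<String> racks = new ArrayList<String>();
        for (String host : names) {
            // Invented convention: hosts named node1..node10 sit in
            // rack 1, everything else in rack 2.
            if (host.matches("node([1-9]|10)")) {
                racks.add("/rack1");
            } else {
                racks.add("/rack2");
            }
        }
        return racks;
    }
}
```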

The second challenge is network performance. Bandwidth is a scarce resource. We make the assumption that within a rack there is much more bandwidth and lower latency than in traffic between two racks. Wouldn't it be great if rack awareness kept our bulky flows within the rack? Actually, it can.

So rack awareness, reliable storage, and high throughput are the main features of multi-rack clusters.

Now let's go into the HDFS internals.

Name Node: it is the single most important server in the cluster. The NameNode is the centerpiece of an HDFS file system. It keeps the directory tree of all files in the file system and tracks where across the cluster the file data is kept. It does not store the data of these files itself. Client applications talk to the NameNode whenever they wish to locate a file, or when they want to add, copy, move, or delete a file. The NameNode responds to successful requests by returning a list of relevant DataNode servers where the data lives. The NameNode is a single point of failure for the HDFS cluster; HDFS is not currently a high-availability system. When the NameNode goes down, the file system goes offline.
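For example, a client can ask the Name Node where a file's blocks live and then read from those Data Nodes directly. A minimal sketch using the Java API (the path is hypothetical):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockLocationExample {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        FileStatus status = fs.getFileStatus(new Path("/user/demo/hello.txt"));

        // The Name Node answers with the Data Nodes that hold each block;
        // the client then reads the block data from those nodes directly.
        BlockLocation[] blocks =
                fs.getFileBlockLocations(status, 0, status.getLen());
        for (BlockLocation block : blocks) {
            System.out.println("offset " + block.getOffset() + " -> "
                    + String.join(", ", block.getHosts()));
        }
        fs.close();
    }
}
```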

So, in short, the Name Node:
- contains the file system metadata
- is the controller that clients communicate with
- monitors the health of the cluster

So let's get into the file system metadata.

The Name Node holds an in-memory snapshot of the entire file system. It looks something like the diagram above: it tracks every file and its replication value, and it also holds the mapping of block IDs to Data Nodes. The Name Node also has an edit log, or journal, that keeps track of all the transactions; every change a client makes is recorded there. So how does the Name Node fold the edit log into its on-disk image? For that, the Name Node needs a reboot, which triggers a checkpoint process. When the Name Node is rebooted, it loads the on-disk image file, called fsimage, merges the journal into it, and writes the updated image back to disk.

But in a production environment we don't reboot the Name Node very often, so the edit log keeps growing and growing. If we then lose the Name Node in a disaster, we also lose all of those changes.

That's why it is nice to have a Secondary Name Node.
The Secondary Name Node takes on the responsibility of merging the edit log with the fsimage file. It periodically goes to the Name Node and says, hey, give me your edit log. The Secondary Name Node holds the most recent fsimage; it merges the two together and uploads the result back to the Name Node. So when the Name Node reboots, it is up to date.
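How often this checkpoint happens is configurable. A minimal sketch, assuming the Hadoop 1.x property names (later releases renamed these under dfs.namenode.checkpoint.*):

```java
import org.apache.hadoop.conf.Configuration;

public class CheckpointTuning {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Checkpoint at least every hour (value is in seconds)...
        conf.setLong("fs.checkpoint.period", 3600);
        // ...or sooner, once the edit log reaches 64 MB.
        conf.setLong("fs.checkpoint.size", 64L * 1024 * 1024);
        System.out.println("checkpoint every "
                + conf.getLong("fs.checkpoint.period", 3600) + "s");
    }
}
```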

Moving on to the Data Nodes.
A Data Node stores data in the Hadoop file system. A functional file system has more than one Data Node, with data replicated across them. On startup, a DataNode connects to the Name Node, spinning until that service comes up, and then responds to requests from the Name Node for file system operations. Client applications can talk directly to a Data Node once the Name Node has provided the location of the data.

Data Nodes send a heartbeat to the Name Node every 3 seconds. If a Data Node does not send a heartbeat for 10 minutes, the Name Node considers it a dead node. That is when it takes all the blocks that lived on that node and re-replicates them across the cluster.
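The roughly-10-minute figure is not hard-coded; it falls out of two settings. A sketch of the arithmetic, assuming the Hadoop 2.x property names (Hadoop 1.x used heartbeat.recheck.interval):

```java
import org.apache.hadoop.conf.Configuration;

public class HeartbeatTimeout {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Data Nodes heartbeat every 3 seconds by default.
        long heartbeatSec = conf.getLong("dfs.heartbeat.interval", 3);
        // The Name Node rechecks liveness every 5 minutes by default.
        long recheckMs =
                conf.getLong("dfs.namenode.heartbeat.recheck-interval", 300000);
        // A Data Node is declared dead after roughly
        // 2 * recheck + 10 * heartbeat = 630,000 ms, about 10.5 minutes.
        long timeoutMs = 2 * recheckMs + 10 * heartbeatSec * 1000;
        System.out.println("dead-node timeout: " + timeoutMs + " ms");
    }
}
```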

Speaking of blocks: blocks are 64 MB in size by default. Block placement is very important for bandwidth reasons.
Replication management tells us whether a block is under-replicated or over-replicated. If it is over-replicated, a replica is marked for removal. If it is under-replicated, the block goes into a priority queue; the blocks with the fewest copies in the cluster sit at the top of the queue.
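The block size can also be chosen per file at create time. A minimal sketch using the Java FileSystem API's five-argument create overload; the path and sizes are only illustrative.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockSizeExample {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        // create(path, overwrite, bufferSize, replication, blockSize)
        FSDataOutputStream out = fs.create(
                new Path("/user/demo/big.dat"),
                true,                 // overwrite if present
                4096,                 // io buffer size in bytes
                (short) 3,            // replication factor
                128L * 1024 * 1024);  // 128 MB blocks instead of the 64 MB default
        out.write(new byte[]{1, 2, 3});
        out.close();
        fs.close();
    }
}
```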

Finally, let's look at some of the tools we can use to interact with HDFS.

From the user's perspective, we will use the FS shell.
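Typical FS shell commands look like "hadoop fs -ls /user/demo", "hadoop fs -put local.txt /user/demo/", and "hadoop fs -cat /user/demo/local.txt". The same operations are available programmatically; here is a minimal listing sketch (the directory is hypothetical):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class FsShellEquivalent {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        // Equivalent to `hadoop fs -ls /user/demo` on the FS shell.
        for (FileStatus status : fs.listStatus(new Path("/user/demo"))) {
            System.out.println(status.getPath()
                    + "  replication=" + status.getReplication()
                    + "  size=" + status.getLen());
        }
        fs.close();
    }
}
```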



