Hadoop Overview

The Hadoop Distributed File System (HDFS) was developed using distributed file system design and runs on commodity hardware. Unlike some other distributed systems, HDFS is highly fault tolerant and is designed to run on low-cost hardware.


HDFS holds very large amounts of data and provides easy access to it. To store such huge data, the files are stored across multiple machines. They are stored in a redundant fashion so that the system can recover from possible data loss in case of failure. HDFS also makes data available to applications for parallel processing.
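As a concrete illustration, the following is a minimal sketch of a client writing and then reading a file through the HDFS Java API. The Namenode URI and the file path are placeholders; in practice the cluster address comes from core-site.xml rather than being hard-coded.

// Minimal sketch: writing and reading a file in HDFS with the Java client.
// The cluster URI and file path below are placeholders for illustration.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import java.net.URI;
import java.nio.charset.StandardCharsets;

public class HdfsReadWrite {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Hypothetical Namenode address; normally taken from the cluster configuration.
        FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:9000"), conf);

        Path file = new Path("/user/demo/sample.txt");

        // Write: the client asks the Namenode where to place blocks,
        // then streams the data to the chosen Datanodes.
        try (FSDataOutputStream out = fs.create(file, true)) {
            out.write("Hello HDFS".getBytes(StandardCharsets.UTF_8));
        }

        // Read: block data is fetched directly from the Datanodes.
        try (FSDataInputStream in = fs.open(file)) {
            byte[] buf = new byte[32];
            int n = in.read(buf);
            System.out.println(new String(buf, 0, n, StandardCharsets.UTF_8));
        }
        fs.close();
    }
}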

The important features of HDFS are:
  1. HDFS is mainly suited for distributed storage and processing of data.
  2. It provides streaming access to file system data.
  3. HDFS follows a master-slave architecture.
  4. The Namenode and Datanodes help users easily check the heartbeat and status of the nodes in the cluster (see the sketch after this list).
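To illustrate, here is a minimal sketch that queries the aggregate capacity figures the Namenode reports for the cluster. Per-node heartbeat details are exposed through the Namenode web UI and the dfsadmin tool rather than through this call; the sketch only assumes a Hadoop client configuration is available on the classpath.

// Minimal sketch: querying the overall cluster capacity reported by the Namenode.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FsStatus;

public class ClusterStatus {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        FsStatus status = fs.getStatus();
        System.out.println("Capacity  (bytes): " + status.getCapacity());
        System.out.println("Used      (bytes): " + status.getUsed());
        System.out.println("Remaining (bytes): " + status.getRemaining());
        fs.close();
    }
}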
Namenode:

The Namenode is software that runs on commodity hardware; the machine on which it runs acts as the master server of the cluster. Following are the tasks of the Namenode:
  1. Manages the file system namespace.
  2. Regulates clients' access to files.
  3. Executes file system operations such as renaming, closing, and opening files and directories.
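The sketch below shows typical namespace operations issued from a client; each of them is resolved by the Namenode as a metadata change. The directory and file paths are placeholders for illustration.

// Minimal sketch: namespace operations that the Namenode coordinates.
// The paths below are placeholders for illustration.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class NamespaceOps {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());

        Path dir = new Path("/user/demo/reports");
        fs.mkdirs(dir);                                  // create a directory in the namespace

        Path oldName = new Path("/user/demo/reports/jan.csv");
        Path newName = new Path("/user/demo/reports/2024-01.csv");
        if (fs.exists(oldName)) {
            fs.rename(oldName, newName);                 // rename: a pure metadata change
        }

        fs.delete(new Path("/user/demo/tmp"), true);     // recursive delete
        fs.close();
    }
}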
Datanode:

The Datanode is software that runs on commodity hardware; every machine that runs a Datanode acts as a slave in the cluster. Following are the tasks of the Datanode:
  1. Performs read/write operations on the file system data, as per client requests.
  2. Performs operations such as block creation, deletion, and replication, according to the instructions of the Namenode.
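As a small example of this block-level behaviour, the following sketch reads a file's block size and replication factor and then requests a different replication factor; the Namenode responds by instructing Datanodes to create or remove block replicas. The file path and the chosen factor are placeholders.

// Minimal sketch: inspecting and changing a file's replication factor.
// Block copies live on Datanodes; the Namenode schedules re-replication.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReplicationDemo {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        Path file = new Path("/user/demo/sample.txt");   // placeholder path

        FileStatus status = fs.getFileStatus(file);
        System.out.println("Block size : " + status.getBlockSize());
        System.out.println("Replication: " + status.getReplication());

        // Request three copies of each block; Datanodes create or delete
        // block replicas as instructed by the Namenode to reach this factor.
        fs.setReplication(file, (short) 3);
        fs.close();
    }
}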