What is Hadoop?
• Apache Hadoop is an open-source software framework that supports data-intensive distributed applications.
• The core Hadoop platform consists of the Hadoop kernel, the MapReduce component, and HDFS (the Hadoop Distributed File System).
• Hadoop is written in the Java programming language and is a top-level Apache project built and used by a global community of contributors.
• Hadoop is the most widely known technology used for Big Data.
• Two languages commonly identified with Hadoop are Pig and Hive, which provide higher-level query layers on top of MapReduce.
• In a Hadoop system, data is distributed across thousands of nodes and processed in parallel.
• Hadoop deals with the complexities of high volume, velocity, and variety of data.
• Hadoop is primarily oriented toward batch processing.
• Hadoop can store petabytes of data reliably.
• Data remains accessible even if a machine fails or drops out of the network.
• MapReduce programs can be used to access and manipulate the data. Developers do not need to know where the data is physically stored; they reference it through a single view provided by the master node (the NameNode), which holds the metadata for all files stored across the cluster. A minimal MapReduce sketch and an HDFS read example follow this list.
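To make the MapReduce idea concrete, here is a minimal word-count job written against the Hadoop MapReduce Java API. It is a sketch rather than a production job: the input and output paths are hypothetical command-line arguments that would point at HDFS directories on a real cluster.

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // Mapper: emits (word, 1) for every word in each input line
    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            for (String token : value.toString().split("\\s+")) {
                if (!token.isEmpty()) {
                    word.set(token);
                    context.write(word, ONE);
                }
            }
        }
    }

    // Reducer: sums the counts emitted for each word
    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));   // HDFS input directory
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // HDFS output directory
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

The developer only describes what to do with each record; the framework decides which nodes run the map and reduce tasks, moving the computation to where the data blocks live.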

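The "single view" of the data can also be seen in the HDFS client API: a program opens a file by its logical path and the NameNode resolves which DataNodes actually hold the blocks. The sketch below assumes the cluster configuration (core-site.xml) is on the classpath, and the file path is purely hypothetical.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsReadExample {
    public static void main(String[] args) throws Exception {
        // Picks up fs.defaultFS (the NameNode address) from the cluster configuration
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Logical path only; the NameNode maps it to the DataNodes holding the blocks
        Path file = new Path("/user/demo/input.txt"); // hypothetical path
        try (FSDataInputStream in = fs.open(file);
             BufferedReader reader = new BufferedReader(new InputStreamReader(in))) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line);
            }
        } finally {
            fs.close();
        }
    }
}
```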