Hadoop Interview Questions and Answers

1)How does the JobTracker schedule a task?
2)What is a TaskTracker in Hadoop? How many instances of TaskTracker run on a Hadoop cluster?
3)What is a JobTracker in Hadoop? How many instances of JobTracker run on a Hadoop cluster?
4)What is a Task instance in Hadoop? Where does it run?
5)How many daemon processes run on a Hadoop system?
6)What is the configuration of a typical slave node on a Hadoop cluster? How many JVMs run on a slave node?
7)What is the difference between HDFS and NAS?
8)How does the NameNode handle DataNode failures?
9)Does the MapReduce programming model provide a way for reducers to communicate with each other? In a MapReduce job, can a reducer communicate with another reducer?
10)Can I set the number of reducers to zero?
11)Where is the mapper output (intermediate key-value data) stored?
12)What are combiners? When should I use a combiner in my MapReduce Job?
13)What are the Writable and WritableComparable interfaces?
14)What is the Hadoop MapReduce API contract for a key and value class?
15)What are IdentityMapper and IdentityReducer in MapReduce?
16)What is the meaning of speculative execution in Hadoop? Why is it important?
17)When are the reducers started in a MapReduce job?
18)If reducers do not start before all mappers finish, why does the progress of a MapReduce job show something like Map(50%) Reduce(10%)? Why is a reducer progress percentage displayed when the mappers have not finished yet?
19)What is HDFS? How is it different from traditional file systems?
20)What is HDFS Block size? How is it different from traditional file system block size?
21)What is a NameNode? How many instances of NameNode run on a Hadoop Cluster?
22)What is a DataNode? How many instances of DataNode run on a Hadoop Cluster?
23)How does the client communicate with HDFS?
24)How are HDFS blocks replicated?

