· This test contains 25 questions. Section A contains 5 objective type questions, section B contains 15 brief answer type questions and section C contains 5 long answer type questions.
· Candidates are required to attempt all the questions and send the answer document to firstname.lastname@example.org, mentioning their name, enrollment number and module number, before their examinations.
· This test should be attempted only after thorough study of the module.
For every question, 4 options are given, out of which only one is correct. Choose the correct option.
1. As companies move past the experimental phase with Hadoop, many cite the need for additional capabilities, including:
a) Improved data storage and information retrieval
b) Improved extract, transform and load features for data integration
c) Improved data warehousing functionality
d) Improved security, workload management and SQL support
2. Point out the correct statement:
a) Hadoop does need specialized hardware to process the data
b) Hadoop 2.0 allows live stream processing of real time data
c) In the Hadoop programming framework, output files are divided into lines or records
d) None of the mentioned
3. According to analysts, for what can traditional IT systems provide a foundation when they’re integrated with big data technologies like Hadoop?
a) Big data management and data mining
b) Data warehousing and business intelligence
c) Management of Hadoop clusters
d) Collecting and storing unstructured data
4. Hadoop is a framework that works with a variety of related tools. Common cohorts include:
a) MapReduce, Hive and HBase
b) MapReduce, MySQL and Google Apps
c) MapReduce, Hummer and Iguana
d) MapReduce, Heron and Trumpet
5. Point out the wrong statement:
a) Hadoop’s processing capabilities are huge, and its real advantage lies in the ability to process terabytes and petabytes of data
b) Hadoop uses a programming model called “MapReduce”; all programs should conform to this model in order to work on the Hadoop platform
c) The programming model, MapReduce, used by Hadoop is difficult to write and test
d) All of the mentioned
Answer the following questions briefly, in not more than 200 words each.
1. What are the basic differences between a relational database and HDFS?
2. What is Hadoop, and what are its components?
3. What are HDFS and YARN?
4. Write about the various Hadoop daemons and their roles in a Hadoop cluster.
5. Compare HDFS with Network Attached Storage (NAS).
6. List the differences between Hadoop 1 and Hadoop 2.
7. Why does one remove or add nodes in a Hadoop cluster frequently?
8. What will you do when the NameNode is down?
9. Why do we use HDFS for applications having large data sets and not when there are a lot of small files?
10. How do you define “block” in HDFS? What is the default block size in Hadoop 1 and in Hadoop 2? Can it be changed?
11. What does the ‘jps’ command do?
12. What is “speculative execution” in Hadoop?
13. What is the difference between an “HDFS Block” and an “Input Split”?
14. Name the three modes in which Hadoop can run.
15. What does a “MapReduce Partitioner” do?
Answer the following questions in detail (about 800 words each).
1. What are the applications of Hadoop?
2. Write a note on HDFS.
3. What are the features of MapReduce?
4. Write a note on the Hadoop cluster.
5. Write a note on the Hadoop architecture.