MODE OF TRAINING: CLASSROOM / ONLINE

HADOOP TRAINING COURSE CONTENT

MODULE 1- INTRODUCTION TO BIG DATA

  • What is big data?
  • How did data become so big?
  • Why big data deserves your attention
  • Use cases of big data
  • Different options for analyzing big data
  • How such huge volumes of data can be analyzed

MODULE 2- INTRODUCTION TO HADOOP

  • What is Hadoop?
  • History of Hadoop
  • How Hadoop got its name
  • Problems with Traditional Large-Scale Systems and Need for Hadoop
  • Where Hadoop is being used
  • Understanding distributed systems and Hadoop
  • RDBMS and Hadoop

MODULE 3- STARTING HADOOP

  • Setting up a single-node Hadoop cluster
  • Configuring Hadoop
  • Understanding Hadoop Architecture
  • Understanding Hadoop configuration files
  • Hadoop components: HDFS and MapReduce
  • Overview of Hadoop Processes
  • Overview of Hadoop Distributed File System
  • NameNodes
  • DataNodes
  • The Command-Line Interface
  • The building blocks of Hadoop
  • Setting up SSH for a Hadoop cluster
  • Running Hadoop
  • Web-based cluster UIs: NameNode UI and MapReduce UI
  • Hands-On Exercise: Using HDFS commands

MODULE 4- UNDERSTANDING MAPREDUCE

  • How MapReduce Works
  • Data flow in MapReduce
  • Map operation
  • Reduce operation
  • Writing a MapReduce program in Java using Eclipse
  • Counting words with Hadoop: running your first program
  • Writing MapReduce Drivers, Mappers and Reducers in Java
  • Real-world “MapReduce” problems
  • Hands-On Exercise: Writing a MapReduce Program and Running a MapReduce Job
  • Java WordCount Code Walkthrough

MODULE 5- HADOOP ECOSYSTEM

  • Hive
  • Sqoop
  • Pig
  • HBase
  • Flume

MODULE 6- EXTENDED SUBJECTS ON HIVE

  • Installing Hive
  • Introduction to Apache Hive
  • Getting data into Hive
  • Hive’s architecture
  • HiveQL (HQL)
  • Query execution
  • Programming Practices and projects in Hive
  • Troubleshooting
  • Hands-On Exercise: Hive Programming

MODULE 7- EXTENDED SUBJECTS ON SQOOP

  • Installing Sqoop
  • Configuring Sqoop
  • Import RDBMS data to Hive using Sqoop
  • Export data from Hive to RDBMS using Sqoop
  • Hands-On Exercise: Import data from RDBMS to HDFS and Hive
  • Hands-On Exercise: Export data from HDFS/Hive to RDBMS

MODULE 8- EXTENDED SUBJECTS ON PIG

  • Introduction to Apache Pig
  • Installing Pig
  • Pig architecture
  • Pig Latin – Reading and writing data using Pig
  • Hands-On Exercise: Programming with Pig (loading data and executing data processing statements)

MODULE 9- EXTENDED SUBJECTS ON HBASE

  • What is HBase?
  • Installing HBase
  • HBase Architecture
  • HBase API
  • Managing large data sets with HBase

MODULE 10- SETUP MULTI-NODE HADOOP CLUSTER

  • Setting up a multi-node Hadoop cluster using a CentOS dump

MODULE 11- FLUME

MODULE 12- ADVANCED MAPREDUCE

  • MapReduce API
  • Combiners and partitioners
  • Custom Data Types
  • Input Formats
  • Output Formats
  • Common MapReduce Algorithms
  • Sorting
  • Searching
  • Indexing

MODULE 13- ADVANCED HADOOP CONCEPT

  • Authentication in Hadoop
  • Administration best practices
  • Hardware selection for master nodes (NameNode, JobTracker, HBase Master)
  • Hardware selection for slave nodes (DataNodes, TaskTrackers, and RegionServers)
  • Cluster growth plan based on storage

MODULE 14- SUMMARY

  • Case studies
  • Sample Applications
  • References 

MODULE 15- TEST
