Online Training



Big Data
Limitations and Solutions of Existing Data Analytics Architecture
Hadoop Features
Hadoop Ecosystem
Hadoop 2.x core components
Hadoop Storage: HDFS
Hadoop Processing: MapReduce Framework
Different Hadoop Distributions

Hadoop 2.x Cluster Architecture - Federation and High Availability
A Typical Production Hadoop Cluster
Hadoop Cluster Modes
Common Hadoop Shell Commands
Hadoop 2.x Configuration Files
Single-Node and Multi-Node Cluster Setup
Hadoop Administration

MapReduce Use Cases
Traditional Way Vs MapReduce Way
Why MapReduce
Hadoop 2.x MapReduce Architecture
Hadoop 2.x MapReduce Components
YARN MR Application Execution Flow
YARN Workflow
Anatomy of MapReduce Program
Demo on MapReduce
Input Splits
Relation between Input Splits and HDFS Blocks
MapReduce: Combiner & Partitioner
Demo on de-identifying Health Care Data set
Demo on Weather Data set 

Distributed Cache
Reduce Join
Custom Input Format
Sequence Input Format
XML File Parsing using MapReduce

About Pig
MapReduce Vs Pig
Pig Use Cases
Programming Structure in Pig
Pig Running Modes
Pig components
Pig Execution
Pig Latin Program
Data Models in Pig
Pig Data Types
Shell and Utility Commands
Pig Latin : Relational Operators
File Loaders, Group Operator
COGROUP Operator
Joins and COGROUP
Diagnostic Operators
Specialized joins in Pig
Built-in Functions (Eval, Load and Store, Math, String, and Date Functions)
Pig UDF
Piggybank
Parameter Substitution (Pig Macros and Pig Parameter Substitution)
Pig Streaming
Testing Pig Scripts with PigUnit
Aviation Use Case in Pig
Pig Demo on Healthcare Data set  

Hive Background
Hive Use Case
About Hive
Hive Vs Pig
Hive Architecture and Components
Metastore in Hive
Limitations of Hive
Comparison with Traditional Database
Hive Data Types and Data Models
Partitions and Buckets
Hive Tables (Managed Tables and External Tables)
Importing Data
Querying Data
Managing Output
Hive Script
Hive UDF
Retail Use Case in Hive
Hive Demo on Healthcare Data set 

HBase Data Model
HBase Shell
HBase Client API
Data Loading Techniques
ZooKeeper Data Model
ZooKeeper Service
Demos on Bulk Loading
Getting and Inserting Data
Filters in HBase 

What is Apache Spark
Spark Ecosystem
Spark Components
History of Spark and Spark Versions/Releases
Spark a Polyglot
What is Scala?
Why Scala?

Flume and Sqoop Demo
Oozie Components
Oozie Workflow
Scheduling with Oozie
Demo on Oozie Workflow
Oozie Co-ordinator
Oozie Commands
Oozie Web Console
Oozie for MapReduce, Hive, and Sqoop
Combined Flow of MR and Hive in Oozie
Hadoop Project Demo
Hadoop Integration with Talend



Mr Ajith Kumar
Enterprise Architect | Big Data Consultant | Analytics SME

► 22 Yrs of Technology & Industry Experience
► Data Science and Machine Learning Consultant
► Center of Excellence member for SOA & Big Data
► Telecom Consultant & SME for SDP, BPM, EMM & Big Data
► Strategy & IT Transformation Consultant for Telecom & Banking