Hadoop Online Training

Hadoop Online Training By IT Experts :

IQ Training offers Hadoop online training. Our Hadoop trainers bring extensive industry experience and strong teaching skills. Our Big Data Hadoop online training is regarded as one of the best online training programs in India, and our students have gone on to find jobs quickly in the USA, UK, Singapore, Japan, and Europe. Hadoop online training is your one-stop solution for learning Hadoop from the comfort of your home, with flexible schedules.

Hadoop Introduction :

Hadoop is open-source software that enables distributed processing of large data sets across clusters of commodity servers using simple programming models. It is designed to scale from a single server to thousands of machines, each offering local computation and storage. Rather than relying on hardware to deliver high availability, these clusters get their resilience from the software’s ability to detect and handle failures at the application layer. This course helps you address those challenges and take advantage of the core capabilities Hadoop provides, in a vendor-neutral way.

IQ Training offers the Hadoop Online Course in a true global setting. 

 

Hadoop Online Training Concepts :

HADOOP BASICS

1. The Motivation for Hadoop

    1. Problems with traditional large-scale systems

        1. Data storage literature survey

        2. Data processing literature survey

        3. Network constraints

2. Requirements for a new approach

3. Hadoop: Basic Concepts

    1. What is Hadoop?

    2. The Hadoop Distributed File System

    3. How Hadoop MapReduce works

    4. Anatomy of a Hadoop cluster

4. Hadoop daemons

    1. Master daemons

        1. Name node

        2. Job tracker

        3. Secondary name node

    2. Slave daemons

        1. Data node

        2. Task tracker

5. HDFS (Hadoop Distributed File System)

    1. Blocks and splits

        1. Input splits

        2. HDFS blocks

    2. Data replication

        1. Hadoop rack awareness

    3. Data high availability

    4. Cluster architecture and block placement

CASE STUDIES

Programming Practices & Performance Tuning

1. Developing MapReduce programs in:

    1. Local mode (running without HDFS)

    2. Pseudo-distributed mode (running all daemons on a single node)

    3. Fully distributed mode (running daemons on dedicated nodes)

Hadoop Administration :

 

Set up Hadoop clusters using the Apache, Cloudera, Hortonworks, and Greenplum distributions

1. Build a fully distributed Hadoop cluster on a single laptop/desktop

2. Install and configure Apache Hadoop on a multi-node cluster in the lab

3. Install and configure the Cloudera Hadoop distribution in fully distributed mode

4. Install and configure the Hortonworks Hadoop distribution in fully distributed mode

5. Install and configure the Greenplum distribution in fully distributed mode

6. Monitoring the cluster

7. Getting used to the management consoles of Cloudera and Hortonworks

8. Name node in safe mode

9. Metadata backup

10. Ganglia and Nagios: cluster monitoring

11. CASE STUDIES

Hadoop Development :

Writing a MapReduce Program

1. Examining a sample MapReduce program, with several examples (a complete sketch follows this list)

2. Basic API Concepts

3. The Driver Code

4. The Mapper

5. The Reducer

6. Hadoop’s Streaming API
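
To make the driver/mapper/reducer split concrete, here is a minimal sketch of the classic WordCount job written against the org.apache.hadoop.mapreduce API; the class names and paths are illustrative, not part of the course material:

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {

        // Mapper: emits (word, 1) for every token in the input line.
        public static class TokenMapper extends Mapper<Object, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            private final Text word = new Text();

            @Override
            protected void map(Object key, Text value, Context context)
                    throws IOException, InterruptedException {
                for (String token : value.toString().split("\\s+")) {
                    if (!token.isEmpty()) {
                        word.set(token);
                        context.write(word, ONE);
                    }
                }
            }
        }

        // Reducer: sums the counts emitted for each word.
        public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
            @Override
            protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable v : values) {
                    sum += v.get();
                }
                context.write(key, new IntWritable(sum));
            }
        }

        // Driver: wires the job together and submits it.
        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "word count");
            job.setJarByClass(WordCount.class);
            job.setMapperClass(TokenMapper.class);
            job.setCombinerClass(SumReducer.class);   // cuts map-side network traffic
            job.setReducerClass(SumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }

Packaged into a jar, the job would run with hadoop jar wordcount.jar WordCount <input path> <output path>; the jar name here is arbitrary.
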
Performing several Hadoop jobs

1.The configure and close Methods

2.Sequence Files

3.Record Reader

4.Record Writer

5.Role of Reporter

6.Output Collector

7.Counters

8. Directly Accessing HDFS (see the sketch after this list)

9.ToolRunner

10.Using The Distributed Cache
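
As a taste of directly accessing HDFS, here is a small sketch that writes and reads a file through the FileSystem API; the path /user/training/notes.txt is an assumed example, and fs.defaultFS is taken from the core-site.xml on the classpath:

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.nio.charset.StandardCharsets;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();   // reads core-site.xml
            FileSystem fs = FileSystem.get(conf);

            // Write a small file (overwriting it if it exists).
            Path file = new Path("/user/training/notes.txt");
            try (FSDataOutputStream out = fs.create(file, true)) {
                out.write("hello hdfs\n".getBytes(StandardCharsets.UTF_8));
            }

            // Read it back line by line.
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(fs.open(file), StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    System.out.println(line);
                }
            }
        }
    }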

Several MapReduce jobs (in detail)

1. Most effective search using MapReduce

2. Generating recommendations using MapReduce

3. Processing log files using MapReduce

4. Identity Mapper

5. Identity Reducer

6. Exploring well-known problems using MapReduce applications

Debugging MapReduce Programs

1. Testing with MRUnit (see the sketch after this list)

2.Logging

3.Other Debugging Strategies.
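
A minimal MRUnit test of the kind item 1 covers, exercising the TokenMapper from the WordCount sketch above; it assumes the mrunit and JUnit jars are on the test classpath:

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mrunit.mapreduce.MapDriver;
    import org.junit.Test;

    // Verifies that the mapper emits (token, 1) once per token, in order.
    public class TokenMapperTest {
        @Test
        public void emitsOneCountPerToken() throws Exception {
            MapDriver<Object, Text, Text, IntWritable> driver =
                    MapDriver.newMapDriver(new WordCount.TokenMapper());
            driver.withInput(new LongWritable(0), new Text("hello hello world"))
                  .withOutput(new Text("hello"), new IntWritable(1))
                  .withOutput(new Text("hello"), new IntWritable(1))
                  .withOutput(new Text("world"), new IntWritable(1))
                  .runTest();
        }
    }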

Advanced MapReduce Programming

1. The Secondary Sort (see the sketch after this list)

2. Custom Input Formats and Output Formats

3. Joins in MapReduce
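
A sketch of the composite key at the heart of the secondary sort pattern; the field names (symbol, timestamp) are illustrative:

    import java.io.DataInput;
    import java.io.DataOutput;
    import java.io.IOException;

    import org.apache.hadoop.io.WritableComparable;

    // Composite key: the job partitions and groups on 'symbol' only, while
    // the shuffle sorts on (symbol, timestamp), so each reduce call sees
    // its values in timestamp order.
    public class SymbolTimeKey implements WritableComparable<SymbolTimeKey> {
        private String symbol = "";
        private long timestamp;

        public SymbolTimeKey() { }                      // required by Hadoop

        public SymbolTimeKey(String symbol, long timestamp) {
            this.symbol = symbol;
            this.timestamp = timestamp;
        }

        public String getSymbol() { return symbol; }

        @Override
        public void write(DataOutput out) throws IOException {
            out.writeUTF(symbol);
            out.writeLong(timestamp);
        }

        @Override
        public void readFields(DataInput in) throws IOException {
            symbol = in.readUTF();
            timestamp = in.readLong();
        }

        // Natural order: by symbol, then by timestamp ascending.
        @Override
        public int compareTo(SymbolTimeKey other) {
            int cmp = symbol.compareTo(other.symbol);
            return cmp != 0 ? cmp : Long.compare(timestamp, other.timestamp);
        }
    }

To complete the pattern, the driver also registers a partitioner and a grouping comparator that both look only at the symbol field, so all records for one symbol reach the same reduce call already sorted by timestamp.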

Monitoring and debugging on a Production Cluster

1.Counters

2.Skipping Bad Records

3.Running in local mode

Tuning for Performance in MapReduce

1. Reducing network traffic with a combiner

2. Partitioners (see the sketch after this list)

3.Reducing the amount of input data

4.Using Compression

5.Reusing the JVM

6.Running with speculative execution

7.Other Performance Aspects

8.CASE STUDIES
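
As a small illustration of custom partitioning, here is a sketch that routes keys to reducers by first letter; it is for teaching purposes, not a recommended production scheme (it skews load toward common letters):

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Partitioner;

    // Sends keys starting with the same letter to the same reducer;
    // non-alphabetic keys fall back to hash partitioning.
    public class FirstLetterPartitioner extends Partitioner<Text, IntWritable> {
        @Override
        public int getPartition(Text key, IntWritable value, int numPartitions) {
            String s = key.toString();
            char first = s.isEmpty() ? 'a' : Character.toLowerCase(s.charAt(0));
            if (first >= 'a' && first <= 'z') {
                return (first - 'a') % numPartitions;
            }
            return (s.hashCode() & Integer.MAX_VALUE) % numPartitions;
        }
    }

In the driver it would be enabled with job.setPartitionerClass(FirstLetterPartitioner.class); a combiner, as in the WordCount sketch, is set with job.setCombinerClass(...).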

CDH4 Enhancements :

1. Name Node High Availability

2. Name Node Federation

3. Fencing

4. MapReduce Version 2

HADOOP ANALYST

Hive

1.Hive concepts

2.Hive architecture

3. Install and configure Hive on a cluster

4. Different types of tables in Hive

5.Hive library functions

6.Buckets

7.Partitions

8. Joins in Hive

    1. Inner joins

    2. Outer joins

9. Hive UDFs (see the sketch below)
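
A minimal sketch of a classic Hive UDF in Java (the simple UDF style, assuming the hive-exec jar is on the build classpath; class and function names are arbitrary):

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Upper-cases a string column; Hive NULLs are passed through.
    public final class UpperUdf extends UDF {
        public Text evaluate(Text input) {
            if (input == null) {
                return null;
            }
            return new Text(input.toString().toUpperCase());
        }
    }

Packaged into a jar, it would be registered in the Hive shell with ADD JAR and CREATE TEMPORARY FUNCTION to_upper AS 'UpperUdf', then called like any built-in function.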

PIG

1. Pig basics

2. Install and configure Pig on a cluster

3. Pig library functions

4. Pig vs. Hive

5. Write sample Pig Latin scripts

6. Modes of running Pig

    1. Running in the Grunt shell

    2. Running as a Java program

7. Pig UDFs (see the sketch after this list)

8. Pig macros

9. Debugging Pig
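
A minimal sketch of a Pig eval UDF in Java; the class name is arbitrary:

    import java.io.IOException;

    import org.apache.pig.EvalFunc;
    import org.apache.pig.data.Tuple;

    // Trims and lower-cases the first field of the input tuple.
    public class Normalize extends EvalFunc<String> {
        @Override
        public String exec(Tuple input) throws IOException {
            if (input == null || input.size() == 0 || input.get(0) == null) {
                return null;
            }
            return input.get(0).toString().trim().toLowerCase();
        }
    }

In a Pig Latin script it would be wired up with REGISTER my-udfs.jar; and invoked as Normalize(field) inside a FOREACH ... GENERATE; the jar and field names are illustrative.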

IMPALA

1. Differences between Impala, Hive, and Pig

2. How Impala achieves high performance

3.Exclusive features of Impala

4.Impala Challenges

5.Use cases of Impala

 

NOSQL

1.HBase

2.HBase concepts

3.HBase architecture

4.HBase basics

5.Region server architecture

6.File storage architecture

7.Column access

8.Scans

9.HBase use cases

10.Install and configure HBase on a multi node cluster

11.Create database, Develop and run sample applications

12. Access data stored in HBase using clients such as Java, Python, and Perl (see the Java sketch after this list)

13. MapReduce client to access HBase data

14.HBase and Hive Integration

15.HBase admin tasks

16. Defining schemas and basic operations

17.Cassandra Basics

18.MongoDB Basics
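
A sketch of HBase access from Java using the classic HTable client API of this era; the users table and its info column family are assumed to have been created beforehand (e.g. create 'users', 'info' in the HBase shell):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.util.Bytes;

    public class HBaseClientExample {
        public static void main(String[] args) throws Exception {
            // Reads hbase-site.xml (ZooKeeper quorum etc.) from the classpath.
            Configuration conf = HBaseConfiguration.create();
            HTable table = new HTable(conf, "users");
            try {
                // Insert one cell: row 'row1', column info:name.
                Put put = new Put(Bytes.toBytes("row1"));
                put.add(Bytes.toBytes("info"), Bytes.toBytes("name"), Bytes.toBytes("alice"));
                table.put(put);

                // Read the cell back.
                Get get = new Get(Bytes.toBytes("row1"));
                Result result = table.get(get);
                byte[] name = result.getValue(Bytes.toBytes("info"), Bytes.toBytes("name"));
                System.out.println(Bytes.toString(name));
            } finally {
                table.close();
            }
        }
    }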

 

Other EcoSystem Components

1.Sqoop

2. Install and configure Sqoop on a cluster

3.Connecting to RDBMS

4. Installing MySQL

5. Import data from Oracle/MySQL into Hive

6. Export data to Oracle/MySQL

7.Internal mechanism of import/export

 

Oozie

1.Oozie architecture

2.XML file specifications

3. Install and configure Oozie on Apache Hadoop

4. Specifying workflows

5.Action nodes

6.Control nodes

7.Oozie job coordinator

 

Flume, Chukwa, Avro, Scribe, Thrift

1.Flume and Chukwa concepts

2. Use cases of Thrift, Avro, and Scribe

3. Install and configure Flume on a cluster

4. Create a sample application to capture logs from the Apache web server using Flume

Hadoop Challenges

1.Hadoop disaster recovery

2. Cases where Hadoop is a suitable fit

 

HIGHLIGHTS

> 100% CERTIFICATION ASSURANCE

> FREE BIG DATA UNIVERSITY (IBM) CERTIFICATION

> TECHNICAL SUPPORT

> INTERVIEW QUESTIONS

> SAMPLE RESUMES