Hadoop

Techieventures offers the best training on Big Data Hadoop. Hadoop is an open-source distributed processing framework that manages data processing and storage for Big Data applications running on clustered systems. We deliver real-time projects in Supply Chain Management, Healthcare, BFSI, warehouse management, and many more domains. We have been ranked the best Hadoop training institute in BTM and Bangalore.

Course Content

  • Introduction To Big Data Hadoop

  • Introduction to Big Data and Its Sources
  • Types of Data
  • Introduction to Hadoop
  • Hadoop Ecosystem
  • Basics of HDFS, MapReduce, Hive, Pig, HBase, Sqoop, Flume, ZooKeeper, YARN, Oozie
  • Hadoop History & Its Significance
  • HDFS

  • Hadoop Basic Terminologies
  • Concepts of Distributed File System
  • HDFS Block Concepts
  • When to use Hadoop and when Not to use Hadoop
  • HDFS Architecture And Storage Mechanism
  • In-Depth Concepts of Hadoop Daemons & Their Functions
  • Replication Factor Concepts of Hadoop
  • Reading & Writing of Files in Hadoop
  • Basic Processing Concepts of MapReduce
  • Data Flow in Hadoop
  • Anatomy of File Read and Write: Block-Level Concepts
  • In-Depth Concepts of HDFS Storage
  • Heartbeat Signal of Hadoop
  • HDFS PRACTICALS (Data storage & Processing)
  • Real Use Case
  • MAPREDUCE

  • Introduction to MapReduce
  • Architecture of MapReduce
  • Architectural flow of MapReduce Processing
  • Concepts of Blocks and Input Splits - Relationship
  • Data Locality Optimisation
  • Programming with MapReduce
  • MapReduce program Life Cycle
  • Combiners & Partitioners
  • MapReduce Real Time Programming
  • YARN architecture

  • Need for Upgrading Hadoop
  • Hadoop 2.0 Architecture
  • NameNode High Availability
  • HDFS Federation
  • Failover and Fencing Mechanism
  • MapReduce2
  • Core Concepts of YARN
  • Upgrading Existing MRv1 code to MRv2
  • Single node cluster

  • Introduction to Different Modes of Hadoop Configuration
  • Choosing OS for the Hadoop Cluster
  • SSH concepts & Configuration
  • Installing Java
  • Creating Hadoop User
  • Understanding the Hadoop configuration File
  • Setting Up The Single Node Cluster
  • Multi node cluster

  • Choosing Hadoop Cluster Hardware
  • Choosing Hadoop Distributions
  • Setting Up Multi Node Cluster
  • Hadoop Security
  • HIVE

  • Introduction to HIVE
  • Hive Architecture
  • Downloading and configuring Hive
  • Running Hive and executing Hive Queries
  • Concepts of Hive Execution Engine
  • Comparison with Traditional Database
  • HiveQL: Data Types, Operators and Functions
  • Types of Hive Tables : Internal & External
  • Partitioning Concepts of Hive
  • Hive Bucketing
  • PIG

  • Introduction to Pig
  • Comparison with Traditional Databases
  • User Defined Functions
  • Filters
  • Data Processing operators
  • Load And Store Function
  • Developing and Testing Pig Latin Scripts
  • Non Linear Data Flow
  • Execution Control in PIG
  • Making PIG FLY
  • Data Layout Optimisation
  • Bad Record Handling
  • SQOOP

  • Introduction to Sqoop
  • Sqoop Architecture
  • Downloading and Configuring Sqoop
  • Exposing Sqoop Tools
  • 30 Basic Sqoop Import Export Cases
  • Incremental Import
  • Importing Data by joining multiple tables
  • Using Custom Boundary Queries in Sqoop
  • Types of Export
  • Use cases of Export
  • Importing Data Directly into HIVE and HBASE
  • Sqoop Integration with Hadoop Ecosystem Components
  • Query Scheduling and Automation
  • Introduction to Flume
  • Flume Components (Source, Channels, Sinks, Channel Selector, Agent, Event, Sink Processors)
  • Flume Architecture
  • Downloading and configuring Flume
  • Flume API Concepts
  • Topology Design Considerations
  • Configuration & Installation
  • HBase Architecture
  • HBase versus RDBMS
  • Schema Design
  • Data Migration
  • Data Management - Backup & Restore
  • HBase - Maintenance & Security
  • Performance Tuning
  • Installing & Running ZooKeeper
  • The ZooKeeper Service: Data Model
  • Services
  • States
  • Sessions
  • Consistency
  • Application Building
  • ZooKeeper in Production
  • Introduction to Cassandra
  • Installing Cassandra
  • Creating Users Table
  • Inserting and Selecting Data
  • Working With Status Updates
  • Anatomy of a Compound Primary Key
  • Beyond Key Value Lookup
  • Partitions
  • Modeling Relationships
  • Learning the Normalized Approach
  • Partial Denormalization
  • Full Denormalization
  • Expanding the Data Model
  • Altering table structure
  • Adding Nodes to Hadoop cluster
  • Removing Nodes from Hadoop cluster
  • Load Balancing
  • Setting up Racks for Hadoop
  • Understanding the Cluster Topology
  • Resource Distribution And Allocation
  • Concepts and Advantages of Hadoop Streaming
  • Executing Python Code in Hadoop
  • Executing R Code in Hadoop
  • Introduction to Oozie
  • Installation and Configuration
  • Understanding Workflow, Actions & Action Types
  • Workflow Applications
  • Oozie Bundle and Bundle State Transition
  • Operations in Oozie
