Professional Documents
Culture Documents
Big Data and Hadoop Details
Big Data and Hadoop Details
Big Data and Hadoop Details
Big Data
Storage
Processing
Hadoop
Data Bases
Traditional
NO SQL
Hadoop
o What is Hadoop
o History of Hadoop
o Why Hadoop
o Why Hadoop
o Name Node
o Data Node
Accessing HDFS
o CLI(Command Line Interface) using hdfs commands
HDFS Commands
Configurations
o Importance of configurations
MAPREDUCE
o Jobtracker
Importance of JobTracker
Importance of TaskTracker
o DBInputFormat
o DBOutputFormat
Mapper
Reducer
Combiner
Partitioner
Distributed Cache
Counters
o What is Counter in Map Reduce Job
Joins
What is the difference between Map Side join and Reduce Side Join
Compression techniques
o Compression Types
o Compression Codecs
o FIFO Scheduler
o Capacity Scheduler
o Fair Scheduler
o What is YARN
Data Locality
Speculative Execution
Configurations
o Importance of configurations
o Writing Unit Tests for Map Reduce Jobs
Apache PIG
o Local Mode
Execution Mechanism
o Grunt Shell
o Script
o Embedded
UDF's
Filter's
Load Functions
Store Functions
o Transformations in Pig
Apache Hive
Hive Introduction
Hive commands
Hive architecture
o Driver
o Compiler
o Semantic Analyzer
o SQL VS Hive QL
Hive Services
o CLI
o Hiveserver
o Hwi
Metastore
o embedded metastore configuration
UDF's
UDAF's
UDTF's
o Transformations in Pig
Partitions
Buckets
SerDe
Apache Zookeeper
Introduction to zookeeper
Apache HBase
Hbase introduction
Hbase basics
o Column families
o Scans
Hbase installation
o Local mode
o Psuedo mode
o Cluster mode
Hbase Architecture
o Storage
o WriteAhead Log
o Block cache
Mapreduce integration
Hbase Usage
o Key design
o Bloom Filters
o Versioning
o Coprocessors
o Filters
Hbase Clients
o REST
o Thrift
o Hive
o Web Based UI
Hbase Admin
o Schema definition
Apache Sqoop
Introduction to Sqoop
Sqoop Installation
Apache Flume
Introduction to flume
Flume installation
Introduction to oozie
Oozie installation
Hadoop Distributions :