Professional Documents
Culture Documents
25 - Map Reduce Recap
25 - Map Reduce Recap
25 - Map Reduce Recap
RETRIEVAL
Index Construction
Bhaskarjyoti Das
Department of Computer Science Engineering
ALGORITHMS FOR INFORMATION
RETRIEVAL
Bhaskarjyoti Das
Department of Computer Science Engineering
ALGORITHMS FOR INFORMATION RETRIEVAL
Working with Large Data
HDFS Master
Client
C0 C1 C1 C0 C5
C5 C2 C5 C3 … C2
• The partitioning phase takes place after the map phase and
before the reduce phase.
• The number of partitions is equal to the number of
reducers.
• The data gets partitioned across the reducers according
to the partitioning function .
• Often a MAP task will produce many pairs of the form
(k1,v1), (k2, v2) ..for same key K
• Can save network time by pre-aggregating the values in
the mapper..combiner is usually same as Reducer function
• Benefit : much less data need to be copied and shuffled
ALGORITHMS FOR INFORMATION RETRIEVAL
MAP REDUCE Conclusion
Bhaskarjyoti Das
Department of Computer Science Engineering