
MapReduce: How to pass HashMap to mappers

I'm designing the new generation of an analysis system that needs to process many events from many sensors in near-real time. To do that, I want to use one of the Big Data analytics platforms such as Hadoop, Spark Streaming, or Flink.

In order to analyze each event, I need to use some metadata from a database table, or at least load it into a cached map.

The problem is that each mapper is going to be parallelized on several nodes.

So I have two things to handle:

  • First, how do I load/pass a HashMap to a mapper?
  • Second, is there any way to keep the HashMap consistent between the mappers?

Serialize the HashMap to a file, store that file in HDFS, and during the MapReduce job configuration phase use the DistributedCache to spread the serialized file across all the mappers. Then, in the map phase, each mapper can read the file, deserialize it, and access the HashMap locally. Because every mapper deserializes the same read-only copy, the map stays consistent across nodes.
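A minimal sketch of the serialize/deserialize step using plain Java object serialization. The class name `MetadataCache`, the file name, and the sample sensor keys are illustrative; in a real Hadoop job you would upload the serialized file to HDFS, register it with `Job.addCacheFile(...)`, and call the read method from the mapper's `setup()`.

```java
import java.io.*;
import java.util.HashMap;

public class MetadataCache {

    // Write the map to a file with standard Java object serialization.
    // In the real flow this file would then be copied into HDFS.
    static void writeMap(HashMap<String, String> map, File file) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(file))) {
            out.writeObject(map);
        }
    }

    // Read the map back; each mapper would do this once in its setup()
    // method, so the lookup cost per event is just a HashMap get.
    @SuppressWarnings("unchecked")
    static HashMap<String, String> readMap(File file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(file))) {
            return (HashMap<String, String>) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sensor metadata used only for this demo.
        HashMap<String, String> meta = new HashMap<>();
        meta.put("sensor-1", "building-A");
        meta.put("sensor-2", "building-B");

        File f = File.createTempFile("metadata", ".ser");
        writeMap(meta, f);

        HashMap<String, String> restored = readMap(f);
        System.out.println(restored.get("sensor-1")); // prints building-A
        f.delete();
    }
}
```

Since every mapper deserializes its own private copy, there is no cross-node synchronization; this is appropriate only for read-only lookup data, which is exactly the metadata use case described above.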
