Hadoop and map-reduce on multicore machines

Original post 2012-09-29 23:41:58 2 3 hadoop/multicore

I have read a lot about Hadoop and Map-Reduce running on clusters of machines. Does anyone know whether the Apache distribution can be run on an SMP machine with several cores? In particular, can multiple Map-Reduce processes be run on the same machine? The scheduler would then take care of spreading them across multiple cores. Thanks. - KG

3 answers

Yes. Each machine has multiple map and reduce slots, and how many it can support is determined by its RAM and CPU. Assuming each task JVM needs about 1 GB, an 8 GB machine with 16 cores should still have only about 7 task slots, because memory runs out before cores do.

From the Hadoop wiki:

Use the configuration knobs mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.reduce.tasks.maximum to control the number of maps/reduces spawned simultaneously on a TaskTracker. By default, each is set to 2, hence one sees a maximum of 2 maps and 2 reduces at any given instant on a TaskTracker.

You can set those on a per-TaskTracker basis to accurately reflect your hardware (i.e. set them to higher numbers on a beefier TaskTracker, and so on).
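The slot arithmetic and the two knobs above can be sketched together. This is only an illustration, not an official sizing rule: the helper names `estimate_task_slots` and `render_slot_properties`, the 1 GB reserved for the OS and daemons, and the split of slots between maps and reduces are all assumptions made for the example. The property names are the ones quoted from the wiki.

```python
import xml.etree.ElementTree as ET


def estimate_task_slots(ram_gb, cores, per_task_gb=1.0, reserved_gb=1.0):
    """Rough slot count: limited by whichever runs out first, memory or cores.

    per_task_gb and reserved_gb are assumed values; adjust them to your setup.
    """
    by_memory = int((ram_gb - reserved_gb) // per_task_gb)
    return max(1, min(cores, by_memory))


def render_slot_properties(map_slots, reduce_slots):
    """Return the two <property> entries as they would appear in mapred-site.xml."""
    settings = {
        "mapred.tasktracker.map.tasks.maximum": map_slots,
        "mapred.tasktracker.reduce.tasks.maximum": reduce_slots,
    }
    chunks = []
    for name, value in settings.items():
        prop = ET.Element("property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
        chunks.append(ET.tostring(prop, encoding="unicode"))
    return "\n".join(chunks)


if __name__ == "__main__":
    # The 8 GB, 16-core machine from the answer: memory is the bottleneck.
    slots = estimate_task_slots(ram_gb=8, cores=16)
    # Arbitrary illustrative split: roughly one third of the slots for reduces.
    reduce_slots = max(1, slots // 3)
    map_slots = slots - reduce_slots
    print(slots, map_slots, reduce_slots)
    print(render_slot_properties(map_slots, reduce_slots))
```

The printed `<property>` elements would go inside the `<configuration>` element of mapred-site.xml on the TaskTracker in question.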

You can also use one of the lightweight MapReduce frameworks designed for multicore computers.

For example:

LeoTask: A lightweight, productive, and reliable mapreduce framework for multicore computers

https://github.com/mleoking/LeoTask

For Apache Hadoop 2.7.3, my experience has been that enabling YARN will also enable multi-core support. Here is a simple guide for enabling YARN on a single node:

https://hadoop.apache.org/docs/r2.7.3/hadoop-project-dist/hadoop-common/SingleCluster.html#YARN_on_a_Single_Node

The default configuration seems to work pretty well. If you want to tune your core usage, look into setting 'yarn.scheduler.minimum-allocation-vcores' and 'yarn.scheduler.maximum-allocation-vcores' in yarn-site.xml (see https://hadoop.apache.org/docs/r2.7.1/hadoop-yarn/hadoop-yarn-common/yarn-default.xml).

Also, see here for instructions on how to configure a simple Hadoop sandbox with multicore support: https://bitbucket.org/aperezrathke/hadoop-aee
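As a rough sketch of what the two vcores settings mentioned above look like in yarn-site.xml, and how one might check what a file currently sets, consider the following. The values 1 and 4 are placeholders chosen for illustration, not recommendations, and `read_vcore_limits` is a hypothetical helper rather than part of Hadoop.

```python
import xml.etree.ElementTree as ET

# Example yarn-site.xml content; the values are illustrative assumptions.
SAMPLE_YARN_SITE = """\
<configuration>
  <property>
    <name>yarn.scheduler.minimum-allocation-vcores</name>
    <value>1</value>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-vcores</name>
    <value>4</value>
  </property>
</configuration>
"""


def read_vcore_limits(xml_text):
    """Collect the scheduler vcores settings found in a yarn-site.xml string."""
    root = ET.fromstring(xml_text)
    found = {}
    for prop in root.findall("property"):
        name = prop.findtext("name", default="").strip()
        if name.endswith("-allocation-vcores"):
            found[name] = int(prop.findtext("value").strip())
    return found


if __name__ == "__main__":
    for name, value in read_vcore_limits(SAMPLE_YARN_SITE).items():
        print(name, value)
```

To inspect a real file, read its contents into a string and pass that to the function instead of the sample text.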

solution1 8 2012-09-30 07:42:21
solution2 0 2015-04-19 13:23:02
solution3 0 2016-11-17 20:47:02