
How to configure Java memory heap space for Hadoop MapReduce?

I tried to run a MapReduce job on about 20 GB of data, and I got an error in the reduce shuffle phase. The error message says it is caused by memory heap space. I then read in many sources that I have to decrease the mapreduce.reduce.shuffle.input.buffer.percent property in mapred-site.xml, whose default value is 0.7. So I decreased it to 0.2.

I want to ask: does that property affect the running time of my MapReduce job? And how can I configure things properly so that my MapReduce job never gets this error?

mapreduce.reduce.shuffle.input.buffer.percent (default 0.70): "The percentage of memory to be allocated from the maximum heap size to storing map outputs during the shuffle."

From this description, it looks like decreasing the property to an arbitrary value may degrade the performance of the shuffle phase, since less of the heap is available for holding map outputs. There was presumably some reasoning and testing behind the default value. You can check other related properties here: http://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml
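To get a feel for what the change from 0.70 to 0.20 means, here is a rough sketch of the arithmetic implied by the property description: the shuffle buffer is taken as the reducer's maximum heap multiplied by the percentage. The 1 GiB heap and the class and method names are assumptions for illustration only; they are not Hadoop APIs, and your actual heap size depends on your cluster settings.

```java
public class ShuffleBufferEstimate {

    // Approximate bytes of reducer heap available for holding map outputs
    // during the shuffle: max heap * mapreduce.reduce.shuffle.input.buffer.percent.
    public static long shuffleBufferBytes(long maxHeapBytes, double inputBufferPercent) {
        if (inputBufferPercent < 0.0 || inputBufferPercent > 1.0) {
            throw new IllegalArgumentException("percent must be between 0.0 and 1.0");
        }
        return (long) (maxHeapBytes * inputBufferPercent);
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024;
        long heap = 1024 * mib; // hypothetical reducer heap of 1 GiB (assumption)

        System.out.println("0.70 -> " + shuffleBufferBytes(heap, 0.70) / mib + " MiB");
        System.out.println("0.20 -> " + shuffleBufferBytes(heap, 0.20) / mib + " MiB");
    }
}
```

With the assumed 1 GiB heap this prints 716 MiB for 0.70 and 204 MiB for 0.20. A smaller buffer means less memory pressure during the shuffle, but map outputs that no longer fit in memory have to go to disk, which is the likely source of the slowdown.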

What is the approximate volume of data output by your mappers? If it is huge, you may want to increase the number of mappers. Likewise, if the number of reducers is low, a heap space error is likely to happen during the reduce phase.

You may want to check your job counters and increase the number of mappers/reducers. You may also try increasing the mapper/reducer memory by setting the properties mapreduce.reduce.memory.mb and mapreduce.map.memory.mb.
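One thing to keep in mind when raising these values: mapreduce.reduce.memory.mb sets the size of the container, while the JVM heap inside it is usually set separately through mapreduce.reduce.java.opts (for example with -Xmx). The sketch below only builds the option strings; the 4096 MB container and the 0.8 heap-to-container ratio are assumptions (a common rule of thumb, not a value from this thread), and the helper names are hypothetical.

```java
public class ContainerHeapSketch {

    // Heap size in MB to pass as -Xmx, leaving some headroom inside the
    // container for non-heap JVM memory. The ratio is an assumed rule of thumb.
    public static int heapMb(int containerMb, double heapRatio) {
        return (int) Math.floor(containerMb * heapRatio);
    }

    // Builds -D style options for the reducer container size and its JVM heap.
    public static String reduceOptions(int containerMb, double heapRatio) {
        return "-Dmapreduce.reduce.memory.mb=" + containerMb
                + " -Dmapreduce.reduce.java.opts=-Xmx" + heapMb(containerMb, heapRatio) + "m";
    }

    public static void main(String[] args) {
        // Hypothetical 4096 MB reducer container with 80% of it given to the heap.
        System.out.println(reduceOptions(4096, 0.8));
    }
}
```

The same pairing applies on the map side with mapreduce.map.memory.mb and mapreduce.map.java.opts. Passing such -D options on the command line generally only works if the job driver parses generic options (for example via ToolRunner); otherwise the values can go into mapred-site.xml.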

