
Order of Mapper, Combiner, Partitioner, Shuffle/Sort

Original post: 2015-01-06 01:10:38 9 1 hadoop

I found the following on page 206 of Hadoop: The Definitive Guide.

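Before it writes to disk, the thread first divides the data into partitions corresponding to the reducers that they will ultimately be sent to. Within each partition, the background thread performs an in-memory sort by key, and if there is a combiner function, it is run on the output of the sort. Running the combiner function makes for a more compact map output, so there is less data to write to local disk and to transfer to the reducer.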

So, with this understanding, can I put the order as Mapper, Partitioner, shuffle/sort, Combiner?
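The spill sequence described in the quoted passage can be sketched in a few lines of plain Python. This is only an illustration, not Hadoop code: the names `partition`, `sum_combiner`, and `spill` are hypothetical, `crc32` stands in for whatever partitioner the job actually uses, and the combiner is assumed to be a word-count style sum.

```python
from itertools import groupby
from zlib import crc32


def partition(key, num_reducers):
    # Hypothetical stand-in for a hash partitioner; crc32 is stable across runs.
    return crc32(key.encode("utf-8")) % num_reducers


def sum_combiner(key, values):
    # Assumed word-count style combiner: collapse all values of a key into one sum.
    return [(key, sum(values))]


def spill(buffer, num_reducers, combiner=None):
    """Simulate one spill of the in-memory map output buffer."""
    # Step 1: divide the records into partitions, one per reducer.
    partitions = {p: [] for p in range(num_reducers)}
    for key, value in buffer:
        partitions[partition(key, num_reducers)].append((key, value))

    spilled = {}
    for p, records in partitions.items():
        # Step 2: sort by key within each partition.
        records.sort(key=lambda kv: kv[0])
        # Step 3: if a combiner is configured, run it on the sorted output.
        if combiner is not None:
            combined = []
            for key, group in groupby(records, key=lambda kv: kv[0]):
                combined.extend(combiner(key, [v for _, v in group]))
            records = combined
        spilled[p] = records
    return spilled


map_output = [("b", 1), ("a", 1), ("b", 1), ("c", 1), ("a", 1)]
result = spill(map_output, num_reducers=2, combiner=sum_combiner)
```

In this sketch the order within a single spill is partition, then sort, then combine, and the combined records are what would be written to local disk and later fetched by the reducers.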

1 Answer

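I have written an article on this (http://0x0fff.com/hadoop-mapreduce-comprehensive-description/). In general you are right, but there are many special cases: the combiner may be skipped for some records, it may run several times for others, and it can even run on the reduce side before the reducer. So you are right in general, but things are much more complicated.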
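Because the combiner may run zero, one, or several times, the final result must not depend on how often it is applied. The sketch below illustrates this with plain Python rather than Hadoop code: `run` is a hypothetical helper that applies a function to chunks of the values a given number of times before the final reduce, and the chunk size of 2 is an arbitrary assumption.

```python
def reduce_sum(values):
    # Sum is associative and commutative, so it is safe to reuse as a combiner.
    return sum(values)


def reduce_mean(values):
    # A plain mean is NOT safe as a combiner: a mean of partial means
    # differs from the mean of the raw values when chunk sizes differ.
    return sum(values) / len(values)


def run(values, func, combiner_passes, chunk=2):
    """Apply func as a combiner `combiner_passes` times, then as the reducer."""
    for _ in range(combiner_passes):
        values = [func(values[i:i + chunk]) for i in range(0, len(values), chunk)]
    return func(values)


values = [1, 2, 3, 4, 5]

# The sum is the same whether the combiner runs 0, 1, or 2 times.
sums = [run(values, reduce_sum, n) for n in (0, 1, 2)]

# The mean changes once the combiner runs at all.
mean_without_combiner = run(values, reduce_mean, 0)
mean_with_one_pass = run(values, reduce_mean, 1)
```

Here every entry of `sums` is 15, while `mean_without_combiner` is 3.0 and `mean_with_one_pass` is not. A job whose correctness relies on the combiner running exactly once would therefore be fragile.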

Related questions

Mapper output doubled in combiner

Hadoop configuration - are mapper/combiner affected by io.sort.factor and io.sort.mb?

Difference between combiner and in-mapper combiner in mapreduce?

Hadoop combiner sort phase

Different context types in hadoop for mapper and combiner

when exactly the combiner runs for each mapper output

where does hadoop store the output files of mapper, partitioner and combiner?

How to use local aggregation methods in MapReduce programs such as in-mapper combiner?

Combiner for just one mapper, in cases where there are two mappers in hadoop

What's the difference between shuffle phase and combiner phase?


