
When should we not use a Combiner in MapReduce?

Original post: 2015-04-17 10:33:20 · Tags: hadoop, mapreduce

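Every Hadoop developer knows that a Combiner is a key optimization for MapReduce, yet it is optional. It can minimize bandwidth use and improve the performance of a MapReduce job. My question is this: Hadoop enables many features by default, such as data locality, but it does not set a Combiner by default. Why? Does that mean a Combiner is not recommended in every case? When should a Combiner not be used? What problems would arise if I set one by default?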

2 answers

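A Combiner can be used if the reduce function is both commutative and associative. This is because the values are combined locally, on the map side, before the shuffle and sort.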

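Commutative: the order in which the values are processed has no effect on the result:

1 + 2 + 3 = 1 + 3 + 2

Associative: the way the values are grouped while being processed has no effect on the result:

(1 + 2) + 3 = 1 + (2 + 3)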

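So a Combiner works well for a sum() operation, but there are operations for which it does not work. Deciding whether a Combiner can be used for a particular algorithm is therefore always the programmer's responsibility.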
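The difference between a safe and an unsafe operation can be sketched in a few lines. This is a toy simulation in plain Python, not Hadoop code: `run_job`, `average`, and the sample data are hypothetical names made up for illustration, and each inner list stands for the output of one mapper.

```python
from collections import defaultdict

def average(values):
    return sum(values) / len(values)

def run_job(splits, reducer, combiner=None):
    """Toy model of a job: each split is one mapper's (key, value) output."""
    shuffled = defaultdict(list)
    for split in splits:
        local = defaultdict(list)
        for key, value in split:
            local[key].append(value)
        for key, values in local.items():
            # With a combiner, each mapper sends one pre-aggregated value per key.
            shuffled[key].extend([combiner(values)] if combiner else values)
    return {key: reducer(values) for key, values in shuffled.items()}

splits = [[("a", 1), ("a", 2), ("a", 3)], [("a", 10)]]

sum_plain = run_job(splits, sum)                    # no combiner
sum_combined = run_job(splits, sum, combiner=sum)   # same answer
avg_plain = run_job(splits, average)                # true average of 1, 2, 3, 10
avg_combined = run_job(splits, average, combiner=average)  # average of averages

print(sum_plain, sum_combined)
print(avg_plain, avg_combined)
```

Summing partial sums gives the same total, while averaging partial averages weights the mapper with one record as heavily as the mapper with three, so reusing an averaging reducer as the combiner changes the answer.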

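If you set a Combiner on your job, Hadoop decides, depending on the data, whether to run it; it may run zero, one, or several times.

If you do not set a Combiner, however, Hadoop will not run one.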
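Since the framework rather than the programmer decides whether the combiner actually runs, the final result has to be the same however often it is applied. A minimal sketch of that check, in plain Python rather than Hadoop, with `combine_passes` and the sample values as hypothetical illustrations:

```python
def combine_passes(values, combiner, passes, chunk=2):
    """Apply `combiner` to chunks of `values`, `passes` times (0 = never run)."""
    for _ in range(passes):
        values = [combiner(values[i:i + chunk])
                  for i in range(0, len(values), chunk)]
    return values

mapper_values = [4, 1, 7, 3, 9]

# The final reduce step after 0, 1, and 2 combiner passes.
results_sum = [sum(combine_passes(mapper_values, sum, n)) for n in range(3)]
results_max = [max(combine_passes(mapper_values, max, n)) for n in range(3)]
# Counting with len() as both combiner and reducer is not stable.
results_len = [len(combine_passes(mapper_values, len, n)) for n in range(3)]

print(results_sum, results_max, results_len)
```

`sum` and `max` give one answer regardless of the number of passes, whereas `len` gives a different count each time, which is why a counting job usually emits 1 per record and sums in the combiner.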

When a Combiner runs, it reduces the size of the map output, so less data is transferred over the network.
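The saving can be illustrated with a word count. The sketch below is a plain-Python model under the assumption that each input line is handled by its own mapper; `map_words`, `combine`, and `reduce_all` are hypothetical helper names, not Hadoop APIs.

```python
from collections import Counter

def map_words(line):
    """Mapper: emit (word, 1) for every word in the line."""
    return [(word, 1) for word in line.split()]

def combine(pairs):
    """Combiner: sum the counts per word within one mapper's output."""
    totals = Counter()
    for word, count in pairs:
        totals[word] += count
    return list(totals.items())

def reduce_all(per_mapper):
    """Reducer: sum the counts per word across all mappers."""
    totals = Counter()
    for pairs in per_mapper:
        for word, count in pairs:
            totals[word] += count
    return dict(totals)

mapper_inputs = ["to be or not to be", "to do is to be"]
raw = [map_words(line) for line in mapper_inputs]
combined = [combine(pairs) for pairs in raw]

records_without = sum(len(pairs) for pairs in raw)      # records shuffled, no combiner
records_with = sum(len(pairs) for pairs in combined)    # records shuffled, with combiner

print(records_without, records_with)
print(reduce_all(raw) == reduce_all(combined))
```

The reducer sees fewer records when the combiner has run, yet produces the same word counts. The gain grows with the number of repeated keys per mapper.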

For the differences between a Combiner and a Reducer, see the following link:

http://blog.optimal.io/3-differences-between-a-mapreduce-combiner-and-reducer/

Related questions

How combiner works when we use multiple inputs in Hadoop MapReduce

When Exactly Combiner is called in MapReduce?

Mapreduce Combiner

Putting combiner to use in mapreduce secondary sorting

Definitive source for when Hadoop MapReduce Runs a Combiner

How to use local aggregation methods in MapReduce programs such as in-mapper combiner?

Can I use Combiner to compute average in a mapreduce job?

Execution time of MapReduce with Combiner

hadoop hbase mapreduce combiner

"Combiner" Class in a mapreduce job


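Notice: technical posts on this site are licensed under CC BY-SA 4.0. If you republish one, please cite this site's URL or the address of the original post. For any questions, contact: yoyou2525@163.com.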
