Map-reduce Concept

What types of inputs and outputs do the map and reduce functions in MapReduce use? How are the inputs and outputs of the two functions connected?

The input of the map function in MapReduce is a document.

The output of the map function in MapReduce is a sequence of tuples (word, 1).

The input of the reduce function in MapReduce is a key and a list of all values for that key.

The output of the reduce function in MapReduce is a sequence of tuples (word, number of occurrences).

Is this correct? And what about connecting the functions: is that the combiner?

The inputs and outputs are connected via serialization.

The default input format is TextInputFormat, which uses a LineRecordReader; however, both of these can be overridden.
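To make the default concrete: with TextInputFormat, as far as I know, each map call receives the byte offset of a line as the key and the line's text as the value. Here is a rough Python sketch of that behaviour. It is not Hadoop code, and `line_records` is a made-up name used only for illustration.

```python
def line_records(data: bytes):
    """Yield (byte_offset, line_text) pairs, one per input line.

    Hypothetical stand-in for what a line-based record reader hands
    to the mapper; it is not part of any Hadoop API.
    """
    offset = 0
    for raw in data.splitlines(keepends=True):
        # The key is the offset at which the line starts, the value is
        # the line without its terminator.
        yield offset, raw.rstrip(b"\r\n").decode("utf-8")
        offset += len(raw)


records = list(line_records(b"hello world\nhello map\n"))
for key, value in records:
    print(key, value)
```

In a word count job the mapper usually ignores the offset key and only splits the line value into words.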

Underneath, everything is only bytes, and the Writable objects in MapReduce (Text, IntWritable, etc.) are just thin layers over a byte[].
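A small Python sketch of the "thin layer over bytes" idea, in case it helps. The helper names are invented for this example, and the text encoding is a simplifying assumption: I use a fixed 4-byte length prefix, whereas Hadoop's Text uses a variable-length one.

```python
import struct


def write_int(value: int) -> bytes:
    # 4-byte big-endian signed integer, in the spirit of IntWritable.
    return struct.pack(">i", value)


def read_int(buf: bytes) -> int:
    return struct.unpack(">i", buf)[0]


def write_text(value: str) -> bytes:
    # Assumption: fixed 4-byte length prefix followed by UTF-8 bytes.
    # This is a simplification, not the real Text wire format.
    encoded = value.encode("utf-8")
    return struct.pack(">i", len(encoded)) + encoded


def read_text(buf: bytes) -> str:
    (length,) = struct.unpack(">i", buf[:4])
    return buf[4:4 + length].decode("utf-8")


# A (word, 1) pair as it might travel from a mapper to a reducer.
pair = write_text("word") + write_int(1)
print(pair)
```

The mapper side writes the key and value into bytes, and the reducer side reads the same bytes back into objects, which is all "connected via serialization" really means here.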

Yes, the Reducer input is the output of the mappers, grouped by key. The output is a set of key-value pairs, or tuples. But both the key and the value can be complex objects, so you can output more than just two fields. A Combiner is just a Reducer that runs on each mapper's output before it is sent to the reducers.
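Putting the pieces together, here is a minimal in-memory word count in Python that I find useful for reasoning about the data flow. It is only a sketch under my own assumptions: `run_job`, `group_by_key` and the list-of-splits input are hypothetical and there is no real distribution or serialization involved.

```python
from collections import defaultdict


def map_fn(offset, line):
    # Input: (byte offset, line of text). Output: (word, 1) pairs.
    for word in line.split():
        yield word, 1


def reduce_fn(key, values):
    # Input: a key and all values for that key. Output: (word, count).
    yield key, sum(values)


def group_by_key(pairs):
    # Stand-in for the shuffle: collect values per key, sorted by key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())


def run_job(splits, combiner=None):
    shuffled = []
    for split in splits:  # one mapper per split
        mapped = [kv for offset, line in split for kv in map_fn(offset, line)]
        if combiner is not None:
            # The combiner is a reducer applied to one mapper's output.
            mapped = [kv for k, vs in group_by_key(mapped) for kv in combiner(k, vs)]
        shuffled.extend(mapped)
    return [kv for k, vs in group_by_key(shuffled) for kv in reduce_fn(k, vs)]


splits = [[(0, "a b a")], [(0, "b a")]]
without = run_job(splits)
with_combiner = run_job(splits, combiner=reduce_fn)
print(without, with_combiner)
```

Reusing `reduce_fn` as the combiner works here because summing is associative and commutative, so pre-aggregating per mapper does not change the final counts; it only reduces how many pairs have to be shuffled.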
