python - How to use map reduce MRJob
I need to apply map reduce with MRJob, but I can't get it to work. I have a large list in which each entry contains two codes followed by a sentence, like this:
L = [
    "E-0053 C-0169 It's goig to be a good day\n",
    "D-0312 B-0291 Peter has arrived late\n",
    "A-0417 B-0187 for more information please call the following number\n"
]
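I need to use map reduce to get a list that, for each pair of leading letters taken from the two codes, gives the number of words in the corresponding sentence. For example, the result for the sample list would be: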
[EC 7, DB 4, AB 8]
I tried:
C1 = [i[0] for i in L]
C2 = [i[7] for i in L]
C1_C2 = [C1[i] + C2[i] for i in range(len(C1))]

class count(MRJob):
    def mapper(self, _, C1_C2):
        [elem.split() for elem in L]
        yield C1_C2, [(len(i) - 2) for i in sentence]

    def reducer(self, key, values):
        yield key, sum(values)

count.run()
You can try this:
L = [
"E-0053 C-0169 It's goig to be a good day\n",
"D-0312 B-0291 Peter has arrived late\n",
"A-0417 B-0187 for more information please call the following number\n"
]
result = [i[0] + i[7] + " " + str(len(i.split()) - 2) for i in L]
print(result)
Output:
['EC 7', 'DB 4', 'AB 8']
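The list comprehension above gives the counts but does not use a map and a reduce step, which is what the question asks about. As a rough sketch of how the same computation splits into those two steps, the following plain-Python version uses a mapper and a reducer written as generators. The function names `mapper`, `reducer` and `run_local` are illustrative assumptions, and `run_local` is only a hypothetical stand-in for the grouping that a framework such as mrjob would do; nothing here calls mrjob itself.

```python
from collections import defaultdict

L = [
    "E-0053 C-0169 It's goig to be a good day\n",
    "D-0312 B-0291 Peter has arrived late\n",
    "A-0417 B-0187 for more information please call the following number\n",
]


def mapper(_, line):
    # Split the record into the two codes and the words of the sentence.
    code1, code2, *words = line.split()
    # Key: first letter of each code; value: number of words in the sentence.
    yield code1[0] + code2[0], len(words)


def reducer(key, values):
    # Sum the word counts of all sentences that share the same letter pair.
    yield key, sum(values)


def run_local(lines):
    # Hypothetical stand-in for the framework: group the mapper output by key
    # (the "shuffle" step), then pass each group to the reducer.
    groups = defaultdict(list)
    for line in lines:
        for key, value in mapper(None, line):
            groups[key].append(value)
    results = []
    for key in groups:
        results.extend(reducer(key, groups[key]))
    return results


print(run_local(L))
```

With mrjob, the two generator bodies would presumably become the `mapper(self, _, line)` and `reducer(self, key, values)` methods of a class deriving from `mrjob.job.MRJob`, and the records would be read line by line from an input file rather than from the list `L`. Note that if several records share a letter pair, the reducer adds their word counts together.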