How to calculate the average number from a text file with MRJob
I am a beginner with MRJob and am having trouble calculating the average of the prime numbers in a text file. I am unsure where to apply the arithmetic logic, and whether I should yield lists when using MRJob. The text file contains one million primes, and this is what I have come up with so far. I don't understand what the key should be in my case.
%%writefile prime_average.py
from mrjob.job import MRJob

class primeAverages(MRJob):
    def mapper(self, _, line):
        results = []
        for x in line.split():
            if(x.isdigit()):
                yield x, 1

    def reducer(self, word, key):
        yield word, sum(word)/len(key)
You can use something like:
def mapper(self, _, line):
    if line.isdigit():
        yield (None, int(line))

def reducer(self, key, values):
    s = 0  # sum of primes
    c = 0  # number of primes
    for p in values:
        s += p
        c += 1
    yield (None, s / c)
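The reason for the single None key is that a global average needs every value to arrive in the same reducer call. In the question's code each prime becomes its own key, and the reducer then fails because sum(word) is applied to a string and len(key) to a generator. A possible refinement, sketched below in plain Python rather than as an MRJob class, is to emit (sum, count) pairs so that a combiner can pre-aggregate before the reducer divides. The run_local driver is a hypothetical stand-in for the framework's shuffle step and is not part of mrjob; in a real job the three functions would presumably become the mapper, combiner and reducer methods of the MRJob subclass.

```python
from collections import defaultdict


def mapper(_, line):
    # Emit every numeric token under one shared key (None) so that
    # all values end up in the same reducer call.
    for token in line.split():
        if token.isdigit():
            yield None, (int(token), 1)


def combiner(key, pairs):
    # Pre-aggregate (sum, count) pairs; partial sums and counts can be
    # merged safely, whereas partial averages cannot.
    total = count = 0
    for s, c in pairs:
        total += s
        count += c
    yield key, (total, count)


def reducer(key, pairs):
    # Merge the partial pairs and divide once at the very end.
    total = count = 0
    for s, c in pairs:
        total += s
        count += c
    if count:
        yield key, total / count


def run_local(lines):
    # Hypothetical local driver that imitates grouping by key.
    partials = defaultdict(list)
    for line in lines:
        per_line = defaultdict(list)
        for key, value in mapper(None, line):
            per_line[key].append(value)
        for key, values in per_line.items():
            for k, partial in combiner(key, values):
                partials[k].append(partial)
    results = []
    for key, values in partials.items():
        results.extend(reducer(key, values))
    return results


if __name__ == "__main__":
    print(run_local(["2", "3", "5", "7"]))
```

Carrying the count alongside the sum is what makes the combiner safe: averaging partial averages would give the wrong answer whenever the partial groups differ in size.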