
How to calculate the average number from a text file with MRJob

I am a beginner with MRJob and am having trouble calculating the average of the prime numbers in a text file. I am unsure where to apply the arithmetic logic, and also whether I should yield lists when using MRJob. The text file contains one million primes. This is what I've come up with so far; I don't understand what the key should be in my case.

%%writefile prime_average.py
from mrjob.job import MRJob

class primeAverages(MRJob):

    def mapper(self, _, line):
        results = []
        for x in line.split():
            if(x.isdigit()):
                yield x, 1

    def reducer(self, word, key):
        yield word, sum(word)/len(key)

You can use something like:

def mapper(self, _, line):
    if line.isdigit():
        yield (None, int(line))

def reducer(self, key, values):
    s = 0 #sum of primes
    c = 0 #number of primes
    for p in values:
        s += p
        c += 1
    yield (None, s / c)
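
The snippet above sends every prime to a single reducer under the key None. As a possible refinement, the mapper could emit (value, 1) pairs so that partial sums and counts can be merged before the final division. The sketch below shows that idea with plain Python generators, so it runs without MRJob installed. The `simulate` driver and its `chunk_size` parameter are hypothetical stand-ins for how a framework might split the input; they are not part of MRJob. In a real job, the three functions would presumably become the `mapper`, `combiner`, and `reducer` methods of an `MRJob` subclass, each taking `self` as the first argument.

```python
from itertools import islice


def mapper(_, line):
    """Emit (None, (value, 1)) for each line that holds a single integer."""
    line = line.strip()
    if line.isdigit():
        yield None, (int(line), 1)


def combiner(key, pairs):
    """Collapse many (partial_sum, partial_count) pairs into one pair."""
    total, count = 0, 0
    for s, c in pairs:
        total += s
        count += c
    yield key, (total, count)


def reducer(key, pairs):
    """Merge the partial pairs and emit the mean, if any values were seen."""
    total, count = 0, 0
    for s, c in pairs:
        total += s
        count += c
    if count:
        yield key, total / count


def simulate(lines, chunk_size=2):
    """Hypothetical local driver (not MRJob): map and combine each chunk
    of lines separately, then reduce all the partial pairs together."""
    it = iter(lines)
    partials = []
    while True:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            break
        mapped = [pair for line in chunk for _, pair in mapper(None, line)]
        partials.extend(pair for _, pair in combiner(None, mapped))
    return [mean for _, mean in reducer(None, partials)]


if __name__ == "__main__":
    print(simulate(["2", "3", "5", "7", "11"]))
```

Carrying the count next to the sum matters because an average of averages is only correct when every chunk has the same size; summing the pairs and dividing once at the end avoids that problem.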
