Why doesn't Python read stdin input as a dictionary?

Question

I'm sure I'm doing something dumb here, but here goes. I'm working on a class assignment for my Udacity class "Intro to Map Reduce and Hadoop". Our assignment is to make a mapper/reducer that will count occurrences of a word across our data set (the body of forum posts). I've got an idea of how to do this, but I can't get Python to read in stdin data to the reducer as a dictionary.

Here's my approach thus far: the mapper reads through the data (in this case, a test string in the code) and spits out a dictionary of word:count for each forum post:

#!/usr/bin/python
import sys
import csv
import re
from collections import Counter


def mapper():
    reader = csv.reader(sys.stdin, delimiter='\t')
    writer = csv.writer(sys.stdout, delimiter='\t', quotechar='"', quoting=csv.QUOTE_ALL)

    for line in reader:
        body = line[4]
        #Counter(body)
        words = re.findall(r'\w+', body.lower())
        c = Counter(words)
        #print c.items()
        print dict(c)





test_text = """\"\"\t\"\"\t\"\"\t\"\"\t\"This is one sentence sentence\"\t\"\"
\"\"\t\"\"\t\"\"\t\"\"\t\"Also one sentence!\"\t\"\"
\"\"\t\"\"\t\"\"\t\"\"\t\"Hey!\nTwo sentences!\"\t\"\"
\"\"\t\"\"\t\"\"\t\"\"\t\"One. Two! Three?\"\t\"\"
\"\"\t\"\"\t\"\"\t\"\"\t\"One Period. Two Sentences\"\t\"\"
\"\"\t\"\"\t\"\"\t\"\"\t\"Three\nlines, one sentence\n\"\t\"\"
"""

# This function allows you to test the mapper with the provided test string
def main():
    import StringIO
    sys.stdin = StringIO.StringIO(test_text)
    mapper()
    sys.stdin = sys.__stdin__

if __name__ == "__main__":
    main()

The count for each forum post then goes to stdout like: {'this': 1, 'is': 1, 'one': 1, 'sentence': 2}

Then the reducer should read in this stdin as a dictionary:

#!/usr/bin/python
import sys
from collections import Counter, defaultdict
for line in sys.stdin.readlines():
    print dict(line)

But that fails, giving me this error message: ValueError: dictionary update sequence element #0 has length 1; 2 is required

Which means (if I understand correctly) that it's reading in each line not as a dict, but as a text string. How can I get Python to understand that the input line is a dict? I've tried using Counter and defaultdict, but I still had the same problem, or it read in each character as an element of a list, which is also not what I want.
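
The error can be reproduced in isolation. The following is a minimal sketch, written in Python 3 syntax rather than the Python 2 used in the post, with a made-up sample line and a hypothetical helper named try_dict. Anything read from stdin is a str, and dict() given a str iterates over its characters; a one-character string cannot be unpacked into a key/value pair, which is what the message reports.

```python
# Hypothetical sample: one line as the reducer would receive it from stdin.
line = "{'this': 1, 'is': 1, 'one': 1, 'sentence': 2}\n"


def try_dict(text):
    """Return the message dict() raises on text, or None if it succeeds."""
    try:
        dict(text)
    except ValueError as exc:
        return str(exc)
    return None


error_message = try_dict(line)
print(type(line).__name__)  # the line is plain text, not a dict
print(error_message)
```

The first character, '{', is element #0 of the "update sequence", so the failure happens before any of the actual counts are looked at.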

Ideally, I want the reducer to read in the dict of each line, then add the values of the next line, so after the second line the values are {'this':1,'is':1,'one':2,'sentence':3,'also':1} and so on.

Thanks, JR

Answer 1

Thanks to @keyser, the ast.literal_eval() method worked for me. Here's what I have now:

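#!/usr/bin/python
import sys
import ast
from collections import Counter

# Running total of word counts across all forum posts
c = Counter()
for line in sys.stdin:
    # Each line is the printed text of a dict; parse it back into a real dict
    lineDict = ast.literal_eval(line)
    # Counter.update adds the new counts to the existing ones
    c.update(lineDict)
print c.most_common()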
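
For readers on Python 3, the same idea can be sketched as a small function that is testable without Hadoop. This is an illustrative sketch, not the course's reference solution: the function name reduce_counts and the in-memory sample input are assumptions, and io.StringIO stands in for sys.stdin.

```python
import ast
import io
from collections import Counter


def reduce_counts(stream):
    """Sum per-post word counts, given one dict literal per input line."""
    totals = Counter()
    for line in stream:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        # literal_eval parses Python literals only; it does not execute code
        totals.update(ast.literal_eval(line))
    return totals


# Hypothetical mapper output for the first two test posts in the question
sample = io.StringIO(
    "{'this': 1, 'is': 1, 'one': 1, 'sentence': 2}\n"
    "{'also': 1, 'one': 1, 'sentence': 1}\n"
)
totals = reduce_counts(sample)
print(totals.most_common())
```

After these two lines the totals match what the question asks for: 'one' is 2 and 'sentence' is 3. In an actual reducer script, sys.stdin would be passed in place of the StringIO object.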

Answered 2014-08-12 18:45:05
