Sum values with Python dictionary
I wonder how to sum values using a Python dictionary. I read a huge file line by line and increment the value for each particular key. Suppose I have the following toy file:
word1 5
word2 3
word3 1
word1 2
word2 1
The desired result is:
my_dict = {'word1':7, 'word2':4, 'word3':1}
Pasted below is my current work:
my_dict = {}
with open('test.txt') as f:
    for line in f:
        line = line.rstrip()
        line = line.split()
        word = line[0]
        frequency = line[1]
        my_dict[word] += int(frequency)
Use a collections.Counter() object:
from collections import Counter
my_dict = Counter()
with open('test.txt') as f:
    for line in f:
        word, freq = line.split()
        my_dict[word] += int(freq)
Note that str.rstrip() is not needed; a str.split() call with no arguments also strips leading and trailing whitespace from the string.
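As a small sketch of that point, the snippet below runs split() on a few made-up lines (the sample strings are assumptions, chosen to include stray spaces, a tab, and different line endings):

```python
# Lines as they might come from iterating over a file: each ends with a newline.
lines = ["word1 5\n", "  word2   3  \n", "word3\t1\r\n"]

# split() with no arguments splits on runs of whitespace and ignores
# leading and trailing whitespace, including the trailing newline.
parsed = [line.split() for line in lines]

# With an explicit separator nothing is stripped, so the newline survives.
with_separator = "word1 5\n".split(" ")

print(parsed)
print(with_separator)
```

So rstrip() only becomes necessary when splitting on an explicit separator.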
Apart from defaulting non-existing keys to 0, Counter() objects have additional advantages, such as listing words ordered by frequency (including a top N), summing, and subtracting.
Running the Counter loop on the toy file gives:
>>> my_dict
Counter({'word1': 7, 'word2': 4, 'word3': 1})
>>> for word, freq in my_dict.most_common():
...     print(word, freq)
...
word1 7
word2 4
word3 1
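The other Counter features mentioned above (top N, summing, subtracting) might look like this in practice. This is only a sketch: the totals are typed in by hand rather than read from test.txt, and the second counter `other` is a hypothetical stand-in for counts from another file.

```python
from collections import Counter

# Totals from the toy file in the question.
my_dict = Counter({'word1': 7, 'word2': 4, 'word3': 1})

# Top N: the two most frequent words.
top_two = my_dict.most_common(2)

# Summing: merge in counts from a second (hypothetical) source.
other = Counter({'word1': 1, 'word4': 2})
combined = my_dict + other

# Subtracting: entries whose result is zero or negative are dropped.
remaining = my_dict - Counter({'word1': 7, 'word2': 1})

# A missing key reads as 0 and is not inserted by the lookup.
missing = my_dict['nope']

print(top_two)
print(combined)
print(remaining)
```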
You can use a defaultdict:
import collections
d = collections.defaultdict(int)
with open('test.txt') as f:
    for row in f:
        temp = row.split()
        d[temp[0]] += int(temp[1])
d is now:
defaultdict(<type 'int'>, {'word1': 7, 'word3': 1, 'word2': 4})
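To see why defaultdict(int) helps here, compare it with a plain dict, where `+=` on a key that does not exist yet raises KeyError. The snippet is a minimal sketch with hand-typed keys, not the file-reading loop itself:

```python
from collections import defaultdict

plain = {}
raised = False
try:
    plain['word1'] += 5      # has to read plain['word1'] first, which is missing
except KeyError:
    raised = True

d = defaultdict(int)         # int() returns 0, which becomes the default value
d['word1'] += 5              # missing key starts at 0, then becomes 5
d['word1'] += 2              # now 7

# Unlike Counter, merely reading a missing key inserts it with value 0.
_ = d['word9']

print(raised, dict(d))
```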
In case someone is working with multiple columns (I had the same problem, but with 4 columns), this should do the trick:
from collections import defaultdict
my_dict = defaultdict(int)
with open("input") as f:
    for line in f:
        if line.strip():
            items = line.split()
            freq = items[-1]
            lemma = tuple(items[:-1])
            my_dict[lemma] += int(freq)
for items, freq in my_dict.items():
    print(items, freq)
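For a quick check without a file on disk, the same idea can be tried on in-memory data. The four-column rows below are invented for illustration, and io.StringIO stands in for the opened file:

```python
import io
from collections import defaultdict

# Hypothetical four-column input: three key columns, then a count.
sample = io.StringIO(
    "run fast adv 2\n"
    "run fast adv 3\n"
    "\n"
    "run slow adv 1\n"
)

totals = defaultdict(int)
for line in sample:
    if line.strip():                      # skip blank lines
        *key, freq = line.split()         # the last column is the count
        totals[tuple(key)] += int(freq)   # lists are unhashable, so use a tuple key

for key, freq in totals.items():
    print(key, freq)
```

The tuple key is what lets any number of leading columns act as one dictionary key.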