How to sum values for the same key
I have a file:
gu|8
gt|5
gr|5
gp|1
uk|2
gr|20
gp|98
uk|1
me|2
support|6
And I want to have one number per TLD, like:
gr|25
gp|99
uk|3
me|2
support|6
gu|8
gt|5
And here is my code:
f = open(file,'r')
d={}
for line in f:
    line = line.strip('\n')
    TLD,count = line.split('|')
    d[TLD] = d.get(TLD)+count
print d
But I get this error:
d[TLD] = d.get(TLD)+count
TypeError: unsupported operand type(s) for +: 'NoneType' and 'str'
Can anybody help?
Taking a look at the full traceback:
Traceback (most recent call last):
  File "mee.py", line 6, in <module>
    d[TLD] = d.get(TLD) + count
TypeError: unsupported operand type(s) for +: 'NoneType' and 'str'
The error is telling us that we tried to add something of type NoneType to something of type str, which isn't allowed in Python.
There's only one object of type NoneType, which, unsurprisingly, is None, so we know that we tried to add a string to None.
The two things we tried to add together in that line were d.get(TLD) and count, and looking at the documentation for dict.get(), we see that what it does is
Return the value for key if key is in the dictionary, else default. If default is not given, it defaults to None, so that this method never raises a KeyError.
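A quick way to see that documented behaviour is the sketch below. The dictionary contents and variable names are made up for illustration and are not part of the original code.

```python
# Hypothetical dictionary, used only to illustrate dict.get().
d = {'gr': 5}

found = d.get('gr')        # key present: returns the stored value
missing = d.get('uk')      # key absent, no default given: returns None
fallback = d.get('uk', 0)  # key absent, default given: returns the default

print(found)     # 5
print(missing)   # None
print(fallback)  # 0
```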
Since we didn't supply a default, d.get(TLD) returned None when it didn't find TLD in the dictionary, and we got the error attempting to add count to it. So, let's supply a default of 0 and see what happens:
f = open('data','r')
d={}
for line in f:
    line = line.strip('\n')
    TLD, count = line.split('|')
    d[TLD] = d.get(TLD, 0) + count
print d
$ python mee.py
Traceback (most recent call last):
  File "mee.py", line 6, in <module>
    d[TLD] = d.get(TLD, 0) + count
TypeError: unsupported operand type(s) for +: 'int' and 'str'
Well, we've still got an error, but now the problem is that we're trying to add a string to an integer, which is also not allowed, because the result would be ambiguous.
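To make the ambiguity concrete, here is a small sketch with a made-up value. The same "+" could reasonably mean concatenation or arithmetic, so Python refuses to pick one when the types are mixed. The variable names are illustrative only.

```python
count = '5'  # a count as read from a text file: a string, not a number

as_text = '0' + count       # str + str: concatenation
as_number = 0 + int(count)  # int + int: arithmetic

try:
    0 + count               # int + str: Python will not guess which was meant
    mixed_ok = True
except TypeError:
    mixed_ok = False

print(as_text)    # 05
print(as_number)  # 5
print(mixed_ok)   # False
```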
The error occurs because line.split('|') returns a list of strings, so we need to explicitly convert count to an integer:
f = open('data','r')
d={}
for line in f:
    line = line.strip('\n')
    TLD, count = line.split('|')
    d[TLD] = d.get(TLD, 0) + int(count)
print d
... and now it works:
$ python mee.py
{'me': 2, 'gu': 8, 'gt': 5, 'gr': 25, 'gp': 99, 'support': 6, 'uk': 3}
Turning that dictionary back into the file output you want is a separate issue (and not attempted by your code), so I'll leave you to work on that.
To answer the title of your question, "how to sum values for the same key": there is a standard-library class called collections.Counter that is a perfect match for you:
import collections
d = collections.Counter()
with open(file) as f:
    for line in f:  # iterate over the lines of the file
        tld, cnt = line.strip().split('|')
        d[tld] += int(cnt)  # missing keys start at 0 in a Counter
Then, to write back:
with open(file, 'w') as f:
    for tld, cnt in sorted(d.items()):
        print >> f, "%s|%d" % (tld, cnt)
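Note that the print >> f form is Python 2 syntax. For readers on Python 3, the same approach could be sketched as below, using f.write so that it runs on either version. The function name sum_by_key and the parameters in_path and out_path are hypothetical, and skipping blank lines is an added assumption about the input.

```python
import collections

def sum_by_key(in_path, out_path):
    """Sum the counts per key in a 'key|count' file and write the totals."""
    totals = collections.Counter()
    with open(in_path) as f:
        for line in f:
            line = line.strip()
            if not line:  # assumption: blank lines carry no data
                continue
            tld, cnt = line.split('|')
            totals[tld] += int(cnt)
    with open(out_path, 'w') as f:
        # Sorted by key, matching the write-back loop above.
        for tld, cnt in sorted(totals.items()):
            f.write('%s|%d\n' % (tld, cnt))
    return totals
```

Writing to a separate output path rather than overwriting the input is a deliberate choice in this sketch, so the original data survives if something goes wrong.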