
How to sum values for the same key

I have a file:

gu|8
gt|5
gr|5
gp|1
uk|2
gr|20
gp|98
uk|1
me|2
support|6

And I want one total per TLD, like this:

 gr|25
 gp|99
 uk|3
 me|2
 support|6
 gu|8
 gt|5

Here is my code:

f = open(file,'r')
d={}
for line in f:
    line = line.strip('\n')
    TLD,count = line.split('|')
    d[TLD] = d.get(TLD)+count

print d

But I get this error:

    d[TLD] = d.get(TLD)+count
TypeError: unsupported operand type(s) for +: 'NoneType' and 'str'

Can anybody help?

Taking a look at the full traceback:

Traceback (most recent call last):
  File "mee.py", line 6, in <module>
    d[TLD] = d.get(TLD) + count
TypeError: unsupported operand type(s) for +: 'NoneType' and 'str'

The error tells us that we tried to add something of type NoneType to something of type str, which Python does not allow.

There is only one object of type NoneType, which, unsurprisingly, is None, so we know that we tried to add a string to None.

The two things we tried to add together on that line were d.get(TLD) and count. Looking at the documentation for dict.get(), we see that what it does is:

Return the value for key if key is in the dictionary, else default. If default is not given, it defaults to None, so that this method never raises a KeyError.
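A small sketch of that documented behavior may help. It is written for Python 3 (an assumption; the code in this thread is Python 2), and the dictionary contents are made up for illustration:

```python
# Hypothetical dictionary, for illustration only.
d = {'gr': 5}

found = d.get('gr')            # key present: returns the stored value
missing = d.get('uk')          # key absent, no default given: returns None
with_default = d.get('uk', 0)  # key absent, default given: returns the default

print(found, missing, with_default)
```

Note that d.get() never adds the missing key to the dictionary; it only decides what to hand back.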

Since we didn't supply a default, d.get(TLD) returned None when it didn't find TLD in the dictionary, and we got the error when attempting to add count to it. So, let's supply a default of 0 and see what happens:

f = open('data','r')
d={}
for line in f:
    line = line.strip('\n')
    TLD, count = line.split('|')
    d[TLD] = d.get(TLD, 0) + count

print d

$ python mee.py
Traceback (most recent call last):
  File "mee.py", line 6, in <module>
    d[TLD] = d.get(TLD, 0) + count
TypeError: unsupported operand type(s) for +: 'int' and 'str'

Well, we still get an error, but now the problem is that we're trying to add a string to an integer, which is also not allowed, because it would be ambiguous: it could mean either numeric addition or string concatenation.
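To make the ambiguity concrete, here is a short sketch (Python 3 assumed, with a made-up value for count) showing the two readings, each of which needs an explicit conversion, and the error raised when neither is chosen:

```python
count = '20'  # a string, like the pieces returned by line.split('|')

# Two plausible meanings of adding 0 and '20':
as_number = 0 + int(count)   # numeric addition
as_text = str(0) + count     # string concatenation

try:
    0 + count                # Python refuses to guess between the two
except TypeError as exc:
    error_message = str(exc)

print(as_number, as_text, error_message)
```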

That's happening because line.split('|') returns a list of strings, so we need to explicitly convert count to an integer:

f = open('data','r')
d={}
for line in f:
    line = line.strip('\n')
    TLD, count = line.split('|')
    d[TLD] = d.get(TLD, 0) + int(count)

print d

... and now it works:

$ python mee.py 
{'me': 2, 'gu': 8, 'gt': 5, 'gr': 25, 'gp': 99, 'support': 6, 'uk': 3}
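For readers on Python 3, the same loop can be sketched as a self-contained script. The data from the question is held in an in-memory io.StringIO object instead of a file on disk, and the names sample and totals, as well as the blank-line check, are additions made for this illustration:

```python
import io

# Sample data from the question, kept in memory so the script runs on its own.
sample = io.StringIO(
    "gu|8\ngt|5\ngr|5\ngp|1\nuk|2\ngr|20\ngp|98\nuk|1\nme|2\nsupport|6\n"
)

totals = {}
for line in sample:
    line = line.strip()
    if not line:  # skip blank lines (an added safeguard, not in the original)
        continue
    tld, count = line.split('|')
    totals[tld] = totals.get(tld, 0) + int(count)

print(totals)
```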

Turning that dictionary back into the file output you want is a separate issue (and not attempted by your code), so I'll leave you to work on that.

To answer the title of your question, "how to sum values for the same key": the standard library has a class called collections.Counter that is a good match for this:

import collections

d = collections.Counter()
with open(file) as f:
    for line in f:  # read the file line by line
        tld, cnt = line.strip().split('|')
        d[tld] += int(cnt)  # a missing key counts as 0

Then, to write the totals back:

with open(file, 'w') as f:
    for tld, cnt in sorted(d.items()):
        print >> f, "%s|%d" % (tld, cnt)
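The print >> f form above is Python 2 syntax. A rough Python 3 equivalent of the whole approach might look like the sketch below; the helper names sum_by_key and write_totals are hypothetical, and an io.StringIO object stands in for the output file so the example runs on its own:

```python
import collections
import io


def sum_by_key(lines):
    """Sum the integer after '|' for each key before it."""
    totals = collections.Counter()
    for line in lines:
        line = line.strip()
        if line:  # ignore blank lines
            tld, cnt = line.split('|')
            totals[tld] += int(cnt)
    return totals


def write_totals(totals, out):
    """Write one 'key|total' line per key, sorted by key."""
    for tld, cnt in sorted(totals.items()):
        print("%s|%d" % (tld, cnt), file=out)


totals = sum_by_key(["gr|5", "gp|1", "uk|2", "gr|20", "gp|98", "uk|1"])
out = io.StringIO()  # stands in for a file opened with open(path, 'w')
write_totals(totals, out)
result = out.getvalue()
print(result, end="")
```

Passing an open file object instead of the StringIO would write the same lines to disk.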
