Python: counting occurrences of values in a dictionary

Question

I'm trying to count how often each country occurs in a dictionary. I read all the countries from a CSV file with a for loop and add them to a list:

landen = []
landen.append({"Datum": datumbestand, "Land": [land]})

Then I try to combine all countries by date:

scores_unique = {}
for item in landen:
    if item['Datum'] not in scores_unique:
        scores_unique.update({item['Datum']: item['Land']})
    else:
        scores_unique[item['Datum']] += item['Land']

When I print my output I get the following (a small part of my data):

[('2017-11-20', [US', 'US', 'US', 'US', 'SK', 'SK', 'IE', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'ES', 'ES', 'DE', 'CA', 'CA', 'CA', 'CA', 'CA', 'CA', 'CA', 'CA', 'CA', 'CA', 
('2017-11-10', ['US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US',

Now I would like to see, for every date, which countries occur most often. Something like:

2017-11-20:
USA 10x
SK 3x
IE 2x

2017-11-10
USA 20x
GB 15x

And see the difference in occurrence between dates. I have been trying for a long time, but I can't manage to count the occurrences and print them.

Answer 1

You don't need to keep duplicate copies of the same items in a list. Use a collections.Counter object to keep a count of each country as you read it straight from your CSV reader/file, keying each counter on the corresponding date in a collections.defaultdict:

from collections import Counter, defaultdict

d = defaultdict(Counter)

for date, country in csv_reader:
    d[date][country] += 1
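As a minimal, self-contained sketch of this counting step, the following uses an in-memory CSV in place of the real file. The two-column layout (date, country), the sample rows, and the name counts_by_date are assumptions for illustration; the question does not show the actual file layout.

```python
import csv
import io
from collections import Counter, defaultdict

# Hypothetical sample data standing in for the real CSV file.
sample_csv = io.StringIO(
    "2017-11-20,US\n"
    "2017-11-20,GB\n"
    "2017-11-20,GB\n"
    "2017-11-10,US\n"
    "2017-11-10,US\n"
)

# Missing dates automatically get an empty Counter,
# and missing countries start at a count of 0.
counts_by_date = defaultdict(Counter)
for date, country in csv.reader(sample_csv):
    counts_by_date[date][country] += 1
```

With a real file, the StringIO object would be replaced by the opened file, and any header row or extra columns would need to be handled before unpacking each row.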

You can then use the most_common method of the Counter objects to get the countries that occur most often on each date:

for date, counter in d.items():
    print(date, counter.most_common(3))
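To get closer to the layout requested in the question, and to compare two dates, something like the following sketch might work. The helper names format_report and count_difference and the sample counts are hypothetical; d here is a plain dict of Counter objects shaped like the defaultdict built above.

```python
from collections import Counter

# Hypothetical per-date counters, shaped like the values of `d` above.
d = {
    "2017-11-20": Counter({"GB": 28, "CA": 10, "US": 4}),
    "2017-11-10": Counter({"US": 61}),
}

def format_report(counts_by_date, top_n=3):
    """Return a 'date:' header followed by 'COUNTRY Nx' lines for each date."""
    lines = []
    for date, counter in counts_by_date.items():
        lines.append(date + ":")
        for country, n in counter.most_common(top_n):
            lines.append("{} {}x".format(country, n))
    return lines

def count_difference(newer, older):
    """Per-country change in count from `older` to `newer` (may be negative)."""
    # A Counter returns 0 for a missing key, so countries seen on only
    # one of the two dates are handled without special cases.
    return {c: newer[c] - older[c] for c in set(newer) | set(older)}

report = format_report(d)
diff = count_difference(d["2017-11-20"], d["2017-11-10"])
print("\n".join(report))
```

A plain dict comprehension is used for the difference because subtracting Counter objects with the - operator drops zero and negative results, which would hide countries whose count went down.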

Answer 2

You can build a dictionary of counts from each date's list using the list count() method.

This will roughly work in the following way:

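# scores_unique maps each date to a list of countries; count within each list
result_occurrences = {date: {c: countries.count(c) for c in set(countries)} for date, countries in scores_unique.items()}
print result_occurrences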

This will work in Python 2.7. For Python 3 you can write:

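# scores_unique maps each date to a list of countries; count within each list
result_occurrences = {date: {c: countries.count(c) for c in set(countries)} for date, countries in scores_unique.items()}
print(result_occurrences)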

Another way to do this is by using collections.Counter.
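A minimal sketch of the Counter variant, assuming scores_unique has the same shape as in the question (a dict mapping each date to a list of country codes). The sample values are made up for illustration.

```python
from collections import Counter

# Hypothetical data in the same shape as scores_unique in the question.
scores_unique = {
    "2017-11-20": ["US", "SK", "SK", "GB", "GB", "GB"],
    "2017-11-10": ["US", "US"],
}

# One Counter per date, built directly from each list of countries.
result_occurrences = {
    date: Counter(countries) for date, countries in scores_unique.items()
}

for date, counter in result_occurrences.items():
    print(date, counter.most_common())
```

Counter makes a single pass over each list, whereas calling count() once per distinct country rescans the list each time.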

Answer 3

Here is a solution based on pandas, applying value_counts to each row, i.e.:

import pandas as pd    
tup= [('2017-11-20', ['US', 'US', 'US', 'US', 'SK', 'SK', 'IE', 'GB', 
 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 
 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB',
 'GB', 'GB', 'GB', 'ES', 'ES', 'DE', 'CA', 'CA', 'CA', 'CA', 'CA', 
 'CA', 'CA', 'CA', 'CA', 'CA']), 
 ('2017-11-10', ['US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 
'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 
'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 
'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 
'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 
'US', 'US', 'US', 'US'])]

count = pd.DataFrame(tup).set_index(0)[1].apply(pd.Series.value_counts).stack()

2017-11-20  CA    10.0
            DE     1.0
            ES     2.0
            GB    28.0
            IE     1.0
            SK     2.0
            US     4.0
2017-11-10  US    61.0
dtype: float64

count.to_dict()

{('2017-11-20', 'ES'): 2.0, ('2017-11-20', 'US'): 4.0, ('2017-11-20', 'CA'): 10.0, ('2017-11-20', 'GB'): 28.0, ('2017-11-20', 'SK'): 2.0, ('2017-11-20', 'IE'): 1.0, ('2017-11-10', 'US'): 61.0, ('2017-11-20', 'DE'): 1.0}

3 answers

Answer 1
2 Accepted 2017-11-20 13:39:05

Answer 2
2 2017-11-20 13:39:44

Answer 3
1 2017-11-20 13:46:02