Python: counting occurrences of values in a dictionary

Question

I am trying to count how often each country occurs in a dictionary. I read all the countries from a CSV file in a for loop and append them to a list:

landen = []
landen.append({"Datum": datumbestand, "Land": [land]})

Then I try to merge all the countries by date:

scores_unique = {}
for item in landen:
    if item['Datum'] not in scores_unique:
        scores_unique.update({item['Datum']: item['Land']})
    else:
        scores_unique[item['Datum']] += item['Land']

When I print the output, I get the following (a small part of the data):

[('2017-11-20', [US', 'US', 'US', 'US', 'SK', 'SK', 'IE', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'ES', 'ES', 'DE', 'CA', 'CA', 'CA', 'CA', 'CA', 'CA', 'CA', 'CA', 'CA', 'CA', 
('2017-11-10', ['US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US',

Now I want to see the most frequently occurring countries for each date. Something like:

2017-11-20:
USA 10x
SK 3x
IE 2x

2017-11-10
USA 20x
GB 15x

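I would also like to see the difference between the dates. I have been trying for a long time, but I cannot manage to count the occurrences and print them.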

Answer 1

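You don't need to keep repeated copies of the same item in a list. Use collections.Counter objects to keep a count of each country as it is read straight from the CSV reader/file, and key each counter by its date in a collections.defaultdict: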

from collections import Counter, defaultdict

d = defaultdict(Counter)

for date, country in csv_reader:
    d[date][country] += 1

You can then use the most_common method of the Counter objects to get the countries with the most occurrences for each date:

for date, counter in d.items():
    print(date, counter.most_common(3))
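
To make the approach above concrete, here is a minimal self-contained sketch. The in-memory CSV text, its two-column layout (date, country) and the helper name count_by_date are assumptions for illustration; the real file in the question may be laid out differently.

```python
import csv
import io
from collections import Counter, defaultdict

# Hypothetical sample data standing in for the real CSV file;
# the two-column layout (date, country) is an assumption.
SAMPLE_CSV = """\
2017-11-20,US
2017-11-20,GB
2017-11-20,GB
2017-11-20,SK
2017-11-10,US
2017-11-10,US
2017-11-10,GB
"""


def count_by_date(rows):
    """Return a dict mapping each date to a Counter of country codes."""
    counts = defaultdict(Counter)
    for date, country in rows:
        counts[date][country] += 1
    return counts


counts = count_by_date(csv.reader(io.StringIO(SAMPLE_CSV)))

# Print each date followed by its countries, most frequent first.
for date, counter in counts.items():
    print(f"{date}:")
    for country, n in counter.most_common():
        print(f"{country} {n}x")
    print()
```

Calling most_common() without an argument lists every country for the date; pass a number, as in the answer, to keep only the top entries.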

Answer 2

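You can build a dictionary of counts from the per-date lists with the list count() method. Note that count() has to be called on each list of countries; the scores_unique dict itself has no count() method.

This works roughly as follows:

result_occurrences = {date: {land: countries.count(land) for land in set(countries)}
                      for date, countries in scores_unique.items()}
print result_occurrences

This will work in Python 2.7. For Python 3, where print is a function, you can write:

result_occurrences = {date: {land: countries.count(land) for land in set(countries)}
                      for date, countries in scores_unique.items()}
print(result_occurrences)

Another approach is to use collections.Counter.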
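
As a rough sketch of the Counter alternative, the snippet below assumes scores_unique already has the shape built in the question (each date mapped to a list of country codes); the sample values are shortened, made-up stand-ins. It also shows one possible way to compare two dates, which the question asks about, by subtracting one Counter from another.

```python
from collections import Counter

# Shortened, made-up stand-in for the scores_unique dict from the question.
scores_unique = {
    "2017-11-20": ["US", "US", "SK", "GB", "GB", "GB"],
    "2017-11-10": ["US", "US", "US", "GB"],
}

# Build one Counter per date.
counts = {date: Counter(countries) for date, countries in scores_unique.items()}

for date, counter in counts.items():
    print(date, counter.most_common())

# Counter.subtract keeps negative counts, so the result shows
# both increases and decreases between the two dates.
diff = Counter(counts["2017-11-20"])
diff.subtract(counts["2017-11-10"])
print(dict(diff))
```

The subtract method is used here rather than the - operator because the operator drops entries whose result is zero or negative.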

Answer 3

Here is a solution based on pandas, applying value_counts:

import pandas as pd    
tup= [('2017-11-20', ['US', 'US', 'US', 'US', 'SK', 'SK', 'IE', 'GB', 
 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 
 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB', 'GB',
 'GB', 'GB', 'GB', 'ES', 'ES', 'DE', 'CA', 'CA', 'CA', 'CA', 'CA', 
 'CA', 'CA', 'CA', 'CA', 'CA']), 
 ('2017-11-10', ['US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 
'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 
'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 
'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 
'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 'US', 
'US', 'US', 'US', 'US'])]

count = pd.DataFrame(tup).set_index(0)[1].apply(pd.Series.value_counts).stack()

2017-11-20  CA    10.0
            DE     1.0
            ES     2.0
            GB    28.0
            IE     1.0
            SK     2.0
            US     4.0
2017-11-10  US    61.0
dtype: float64

count.to_dict()

{('2017-11-20', 'ES'): 2.0, ('2017-11-20', 'US'): 4.0, ('2017-11-20', 'CA'): 10.0, ('2017-11-20', 'GB'): 28.0, ('2017-11-20', 'SK'): 2.0, ('2017-11-20', 'IE'): 1.0, ('2017-11-10', 'US'): 61.0, ('2017-11-20', 'DE'): 1.0}
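
The counts above come out as floats because apply(pd.Series.value_counts) builds an intermediate table that holds NaN for countries missing on a given date. A possible alternative that keeps integer counts is sketched below, assuming pandas 0.25 or later (for DataFrame.explode); the column names Datum and Land and the shortened data are assumptions.

```python
import pandas as pd

# Shortened, made-up stand-in for the (date, countries) tuples above.
tup = [
    ("2017-11-20", ["US", "US", "SK", "GB", "GB", "GB"]),
    ("2017-11-10", ["US", "US", "US", "GB"]),
]

df = pd.DataFrame(tup, columns=["Datum", "Land"])

# explode() gives every list element its own row; the countries
# are then counted within each date group.
count = df.explode("Land").groupby("Datum")["Land"].value_counts()

print(count)
print(count.to_dict())
```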

3 answers

Answer 1
2, accepted, 2017-11-20 13:39:05

Answer 2
2, 2017-11-20 13:39:44

Answer 3
1, 2017-11-20 13:46:02
