
How to Find Common Keys in an Array of Dictionaries in Python

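Given an array of dictionaries: [{'key1': 'valueA', 'key2': 'valueB'}, {'key1': 'valueC', 'key3': 'valueD'}, {}, ... ]

How would you find the most commonly used keys across the dictionaries and rank them?

In this example, the key 'key1' occurs twice, so it would be ranked 1st. Then, if 'key2' occurs with the next most common frequency, it would be ranked 2nd, and so on.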

Use this:

keys = dict()
# d is your list of dictionaries
for i in d:
    for k, v in i.items():
        if k in keys:
            keys[k] += 1
        else:
            keys[k] = 1
# named max_count so the built-in max() is not shadowed
max_count = 0
max_key = ''
for k, v in keys.items():
    if v > max_count:
        max_count = v
        max_key = k
print(max_count, max_key)
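
The loop above reports only the single most frequent key, while the question asks for a ranking. A minimal sketch of extending the same dictionary-based counting to rank every key could look like this; `rank_keys` and `dicts` are illustrative names, not part of the original answer:

```python
# Sketch: count keys with a plain dict, then rank all of them.
# rank_keys and dicts are hypothetical names chosen for this example.
def rank_keys(dicts):
    counts = {}
    for d in dicts:
        for k in d:
            counts[k] = counts.get(k, 0) + 1
    # sorted() is stable, so keys with equal counts keep first-seen order
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)

ranking = rank_keys([{'key1': 'valueA', 'key2': 'valueB'},
                     {'key1': 'valueC', 'key3': 'valueD'},
                     {}])
for position, (key, count) in enumerate(ranking, start=1):
    print(position, key, count)
```

Here `position` is simply the index in the sorted list, so tied keys still get different positions.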

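You can use pandas to parse the dicts and then call describe to get statistics. The count row shows how many dictionaries contain each key.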

import pandas as pd

lod = [{'points': 50, 'time': '5:00', 'year': 2010}, 
{'points': 25, 'time': '6:00', 'month': "february"}, 
{'points':90, 'time': '9:00', 'month': 'january'}, 
{'points_h1':20, 'month': 'june'}]

df = pd.DataFrame(lod)

print(df.describe(include=['float', 'object']))

'''
           points  time    year    month  points_h1
count    3.000000     3     1.0        3        1.0
unique        NaN     3     NaN        3        NaN
top           NaN  9:00     NaN  january        NaN
freq          NaN     1     NaN        1        NaN
mean    55.000000   NaN  2010.0      NaN       20.0
std     32.787193   NaN     NaN      NaN        NaN
min     25.000000   NaN  2010.0      NaN       20.0
25%     37.500000   NaN  2010.0      NaN       20.0
50%     50.000000   NaN  2010.0      NaN       20.0
75%     70.000000   NaN  2010.0      NaN       20.0
max     90.000000   NaN  2010.0      NaN       20.0
'''

Note, however, that this does not work for nested dictionaries! You also need to install pandas (for example with pip).
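
For the nested case mentioned above, one possible workaround that avoids pandas is to walk the structure recursively. This is only a sketch under an assumption: a key is counted by name wherever it appears, so a top-level 'b' and a nested 'b' are merged into one count. `count_keys` and `nested` are made-up names for illustration:

```python
from collections import Counter

def count_keys(obj, counter=None):
    """Recursively count dict keys, descending into nested dicts and lists."""
    if counter is None:
        counter = Counter()
    if isinstance(obj, dict):
        for key, value in obj.items():
            counter[key] += 1
            count_keys(value, counter)
    elif isinstance(obj, (list, tuple)):
        for item in obj:
            count_keys(item, counter)
    return counter

# Hypothetical sample data with one level of nesting
nested = [{'a': {'b': 1}}, {'a': 2, 'b': {'c': 3}}]
nested_counts = count_keys(nested)
print(nested_counts.most_common())
```

If nested keys should be kept apart from top-level ones, the function would need to track the path to each key instead of the bare name.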

Is this the answer you are looking for?

from collections import Counter
d=[{'key1': 'valueA', 'key2': 'valueB'}, {'key1': 'valueC', 'key3': 'valueD'}, {'key1': 'valueC'}]
keys=[]
for i in d:
    for j in i.keys():
        keys.append(j)
        
Counter(keys)

output

Counter({'key1': 3, 'key2': 1, 'key3': 1})

You can use Counter to do the counting:

import collections

def count(array):
    counter = collections.Counter()
    for d in array:  # renamed from "dict" to avoid shadowing the built-in type
        counter.update(d.keys())
    return counter

print(count([{'a': 1}, {'a': 2}])) # Counter({'a': 2})
print(count([{'f': 2}, {'f': '2'}, {'b': 10}])) # Counter({'f': 2, 'b': 1})

Another nice solution using pandas is to convert the dictionaries into DataFrames, concatenate them, and then use value_counts. Here is a small example:

import pandas as pd

d1 = {'key1': '1', 'key2': '2'}
d2 = {'key1': '1', 'key3': '3'}

df1 = pd.DataFrame.from_dict(d1.items())
df2 = pd.DataFrame.from_dict(d2.items())

frames = [df1, df2]
result = pd.concat(frames)
print(result[0].value_counts())

which prints

key1    2
key2    1
key3    1
Name: 0, dtype: int64

# Extract the keys in a flat list (data is your list of dictionaries)
keys = [k for i in data for k in i]

# Sort and group; sorted() is a built-in, only groupby comes from itertools
from itertools import groupby
s = sorted(keys)
g = groupby(s)
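
The snippet above stops once the keys are grouped. A minimal sketch of turning those groups into a ranking, assuming `data` holds the list of dictionaries from the question, might be:

```python
from itertools import groupby

data = [{'key1': 'valueA', 'key2': 'valueB'},
        {'key1': 'valueC', 'key3': 'valueD'},
        {}]

# groupby only merges adjacent equal items, hence the sort first
keys = sorted(k for d in data for k in d)
counts = [(key, len(list(group))) for key, group in groupby(keys)]

# Rank by descending count; the sort is stable, so ties stay alphabetical
ranking = sorted(counts, key=lambda pair: pair[1], reverse=True)
print(ranking)
```

Sorting makes this O(n log n) in the number of keys, which is usually fine but slower than a single counting pass.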

Another approach using collections.Counter:

from collections import Counter
d=[{'key1': 'valueA', 'key2': 'valueB'}, {'key1': 'valueC', 'key3': 'valueD'}, {} ]
c = Counter(key for t in d for key in t.keys())

Output:

Counter({'key1': 2, 'key2': 1, 'key3': 1}) 
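
To go from these counts to an explicit ranking, `Counter.most_common()` returns (key, count) pairs ordered from most to least frequent. A short sketch using the same sample list:

```python
from collections import Counter

d = [{'key1': 'valueA', 'key2': 'valueB'},
     {'key1': 'valueC', 'key3': 'valueD'},
     {}]
c = Counter(key for t in d for key in t)

# most_common() with no argument returns every key, most frequent first
for rank, (key, count) in enumerate(c.most_common(), start=1):
    print(rank, key, count)

# Passing n limits the result to the n most frequent keys
top_two = c.most_common(2)
```

Keys with equal counts come back in the order they were first encountered.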

There are many ways to do this. I think one of the cleanest is to use the Counter class, but here is how I would do it:

my_dicts = [{"hello": 2},{"hello1": 3, "hello": 5},{"hello2": 5, "hello": 7, "hello1": 4}]


merged = {}
for d in my_dicts:
  for k in d.keys():
    if k in merged:
      merged[k] += 1
    else:
      merged[k] = 1
results = {k: v for k, v in sorted(merged.items(), key=lambda item: item[1], reverse = True)}

The variable results ends up holding a dictionary of the keys sorted by count in descending order, like this:

Output:

{'hello': 3, 'hello1': 2, 'hello2': 1}
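
If keys with the same count should share a rank, the counts can be mapped to rank numbers in one more step. The following is a sketch of that idea, not part of the original answer; the extra key 'hello3' is made up here to show a tie:

```python
# Counts as produced above, plus a hypothetical 'hello3' that ties with 'hello2'
merged = {'hello': 3, 'hello1': 2, 'hello2': 1, 'hello3': 1}

# Distinct counts, highest first; the position of a count is its rank
levels = sorted(set(merged.values()), reverse=True)
rank_of = {count: i for i, count in enumerate(levels, start=1)}

# Every key receives the rank of its count, so tied keys share a rank
ranks = {key: rank_of[count] for key, count in merged.items()}
print(ranks)  # {'hello': 1, 'hello1': 2, 'hello2': 3, 'hello3': 3}
```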

Try this:

dicts =[{'key1': 'valueA', 'key2': 'valueB'}, {'key1': 'valueC', 'key3': 'valueD'}]

lis = [list(d.keys()) for d in dicts]
x = sorted([j for i in lis for j in i])
print(x)
for i, each in enumerate(x):
    # skip repeats of the previous key; i > 0 stops x[-1] being compared with x[0]
    if i > 0 and x[i-1] == x[i]:
        continue
    else:
        indey = x.count(each)
        print(f"{each} = {indey}")

Output

['key1', 'key1', 'key2', 'key3']
key1 = 2
key2 = 1
key3 = 1

[Program finished]
