
How to Find Common Keys in an Array of Dictionaries in Python

Given an array of dictionaries: [{'key1': 'valueA', 'key2': 'valueB'}, {'key1': 'valueC', 'key3': 'valueD'}, {}, ... ]

How would you find the most common keys among the dictionaries and rank them?

In this example, 'key1' appears twice, so it would be ranked number 1. If 'key2' had the next highest frequency, it would be ranked number 2, and so on.

Use this:

keys = dict()
# d is your list of dictionaries
for i in d:
    for k, v in i.items():
        if k in keys:
            keys[k] += 1
        else:
            keys[k] = 1
# Find the key with the highest count (avoid shadowing the built-in max)
max_count = 0
max_key = ''
for k, v in keys.items():
    if v > max_count:
        max_count = v
        max_key = k
print(max_count, max_key)
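The loop above reports only one key even when several keys tie for the highest count. A minimal sketch that returns every tied key might look like this; the helper name top_keys is made up for illustration and is not part of the answer above.

```python
from collections import Counter

def top_keys(dicts):
    """Return (count, keys) for every key sharing the highest count.

    top_keys is a hypothetical helper name used only for this sketch.
    """
    counts = Counter(k for d in dicts for k in d)
    if not counts:
        # No keys at all (empty list or only empty dictionaries)
        return 0, []
    best = max(counts.values())
    return best, sorted(k for k, c in counts.items() if c == best)

data = [{'key1': 'valueA', 'key2': 'valueB'}, {'key2': 'valueC', 'key1': 'valueD'}, {}]
print(top_keys(data))  # (2, ['key1', 'key2'])
```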

You can use pandas to load the list of dictionaries into a DataFrame and then call describe() to get statistics. The count row shows how many dictionaries contain each key.

import pandas as pd

lod = [{'points': 50, 'time': '5:00', 'year': 2010}, 
{'points': 25, 'time': '6:00', 'month': "february"}, 
{'points':90, 'time': '9:00', 'month': 'january'}, 
{'points_h1':20, 'month': 'june'}]

df = pd.DataFrame(lod)

print(df.describe(include=['float', 'object']))

'''
           points  time    year    month  points_h1
count    3.000000     3     1.0        3        1.0
unique        NaN     3     NaN        3        NaN
top           NaN  9:00     NaN  january        NaN
freq          NaN     1     NaN        1        NaN
mean    55.000000   NaN  2010.0      NaN       20.0
std     32.787193   NaN     NaN      NaN        NaN
min     25.000000   NaN  2010.0      NaN       20.0
25%     37.500000   NaN  2010.0      NaN       20.0
50%     50.000000   NaN  2010.0      NaN       20.0
75%     70.000000   NaN  2010.0      NaN       20.0
max     90.000000   NaN  2010.0      NaN       20.0
'''

Note, however, that this does not work with nested dictionaries, and you need to install pandas with pip.
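For the nested case, one possible workaround without pandas is to walk the dictionaries and count keys at every level. This is only a sketch: the function name count_keys_nested and the sample data are assumptions, and keys are counted by bare name, so the same name at different depths is treated as one key.

```python
from collections import Counter

def count_keys_nested(dicts):
    """Count keys across a list of dicts, descending into nested dict values.

    count_keys_nested is a hypothetical helper name used only for this sketch.
    """
    counter = Counter()
    stack = list(dicts)  # dictionaries still waiting to be visited
    while stack:
        current = stack.pop()
        for key, value in current.items():
            counter[key] += 1
            if isinstance(value, dict):
                stack.append(value)
    return counter

# Made-up sample data with one level of nesting
nested = [
    {'points': 50, 'meta': {'year': 2010}},
    {'points': 25, 'meta': {'year': 2011, 'month': 'june'}},
]
print(count_keys_nested(nested).most_common())
```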

Is this what you are looking for?

from collections import Counter
d=[{'key1': 'valueA', 'key2': 'valueB'}, {'key1': 'valueC', 'key3': 'valueD'}, {'key1': 'valueC'}]
keys=[]
for i in d:
    for j in i.keys():
        keys.append(j)
        
Counter(keys)

Output:

Counter({'key1': 3, 'key2': 1, 'key3': 1})
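To turn those counts into an explicit ranking, Counter.most_common() returns the (key, count) pairs ordered from most to least frequent. A short sketch using the same sample list; note that keys with equal counts keep the order in which they were first seen, so the rank number alone does not mark ties.

```python
from collections import Counter

d = [{'key1': 'valueA', 'key2': 'valueB'}, {'key1': 'valueC', 'key3': 'valueD'}, {'key1': 'valueC'}]

# Iterating over a dict yields its keys, so no intermediate list is needed
counts = Counter(key for item in d for key in item)

# most_common() sorts by count, highest first
for rank, (key, count) in enumerate(counts.most_common(), start=1):
    print(rank, key, count)
# 1 key1 3
# 2 key2 1
# 3 key3 1
```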

You can use Counter to count that:

import collections

def count(array):
    counter = collections.Counter()
    for item in array:  # avoid shadowing the built-in dict
        counter.update(item.keys())
    return counter

print(count([{'a': 1}, {'a': 2}])) # Counter({'a': 2})
print(count([{'f': 2}, {'f': '2'}, {'b': 10}])) # Counter({'f': 2, 'b': 1})

Another nice solution using pandas is to convert the dictionaries into DataFrames, concatenate them, and then use value_counts. Here is a small example:

import pandas as pd

d1 = {'key1': '1', 'key2': '2'}
d2 = {'key1': '1', 'key3': '3'}

# Each (key, value) pair becomes one row; column 0 holds the keys
df1 = pd.DataFrame(list(d1.items()))
df2 = pd.DataFrame(list(d2.items()))

frames = [df1, df2]
result = pd.concat(frames)
print(result[0].value_counts())

Which prints

key1    2
key2    1
key3    1
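Name: 0, dtype: int64

# Extract the keys in a flat list (data is your list of dictionaries)
keys = [k for i in data for k in i]

# Sort and group (sorted is a built-in, only groupby comes from itertools)
from itertools import groupby
s = sorted(keys)
g = groupby(s)

# Count the size of each group and rank by frequency, most common first
counts = sorted(((k, len(list(grp))) for k, grp in g), key=lambda kc: kc[1], reverse=True)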

Another approach using collections.Counter:

from collections import Counter
d=[{'key1': 'valueA', 'key2': 'valueB'}, {'key1': 'valueC', 'key3': 'valueD'}, {} ]
c = Counter(key for t in d for key in t.keys())
print(c)

Outputs:

Counter({'key1': 2, 'key2': 1, 'key3': 1}) 

There are many ways to do this. One of the cleanest is probably the Counter class, but I would do it the following way:

my_dicts = [{"hello": 2},{"hello1": 3, "hello": 5},{"hello2": 5, "hello": 7, "hello1": 4}]


merged = {}
for d in my_dicts:
  for k in d.keys():
    if k in merged:
      merged[k] += 1
    else:
      merged[k] = 1
results = {k: v for k, v in sorted(merged.items(), key=lambda item: item[1], reverse = True)}

The variable results ends up containing a dictionary of the keys and their counts, sorted by count in descending order:

Output:

{'hello': 3, 'hello1': 2, 'hello2': 1}
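If you also want explicit rank numbers, where keys with the same count share a rank, a sketch along these lines could work on such a dictionary of counts. The function name dense_ranks and the sample counts are made up for illustration.

```python
def dense_ranks(counts):
    """Map each key to a rank; keys with equal counts share the same rank.

    dense_ranks is a hypothetical helper name used only for this sketch.
    """
    ranks = {}
    rank = 0
    previous = None
    # sorted() is stable, so tied keys keep their original relative order
    for key, count in sorted(counts.items(), key=lambda item: item[1], reverse=True):
        if count != previous:
            rank += 1
            previous = count
        ranks[key] = rank
    return ranks

# Made-up counts containing a tie for second place
print(dense_ranks({'hello': 3, 'hello1': 2, 'hello2': 2, 'hello3': 1}))
# {'hello': 1, 'hello1': 2, 'hello2': 2, 'hello3': 3}
```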

Try this:

dicts =[{'key1': 'valueA', 'key2': 'valueB'}, {'key1': 'valueC', 'key3': 'valueD'}]

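# Collect every key from every dictionary into one sorted list
x = sorted(key for each_dict in dicts for key in each_dict)
print(x)
for i, each in enumerate(x):
    # Skip repeats so each key is reported once; the i > 0 check stops
    # the first item from being compared with the last one via x[-1]
    if i > 0 and x[i-1] == each:
        continue
    indey = x.count(each)
    print(f"{each} = {indey}")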

Output

['key1', 'key1', 'key2', 'key3']
key1 = 2
key2 = 1
key3 = 1

