
Histogram in python from a list of dictionaries

I have a list of dictionaries:

list_of_dicts = [{'user':user1, 'yob':1984, 'saves':24, 'hidden':28},
{'user':user2, 'yob':1989, 'saves':7, 'hidden':51}, {...}, ...]

And I want to make a stacked histogram plot of saves and hidden on the y-axis and yob on the x-axis. So the histogram should bin by yob and sum the number of saves or hidden for each dictionary in the list. For example, if there are 3 dictionaries with the common yob 1998 with saves of 8, 19 and 4 then the total saves for yob 1998 should be 31 and the hist plot for saves should be 31 in height in 1998. So something like:

plt.hist([list_of_dicts['yob']['saves'], list_of_dicts['yob']['hidden']],
bins=45, stacked=True)
plt.show()

I'm not sure of the syntax for this, or how to access the elements in the list correctly. Could anyone help? Thanks. Edit: I know you can't index a list with a string (list_of_dicts['yob']); that is exactly where I am stuck, hence the question.

You can first collect everything into a single dictionary of lists, one list per key:

master_dict = {}
# for each key, gather that key's value from every dictionary into one list
for key in list_of_dicts[0]:
    master_dict[key] = [d[key] for d in list_of_dicts]
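
As a side note, pandas can normally build the same table straight from the list of dictionaries, so the intermediate master_dict is optional. The sketch below uses made-up records in the shape of the question's list_of_dicts (the third record and the quoted user names are assumptions for illustration) and checks that both routes give the same frame.

```python
import pandas as pd

# Hypothetical sample data shaped like the question's list_of_dicts.
list_of_dicts = [
    {'user': 'user1', 'yob': 1984, 'saves': 24, 'hidden': 28},
    {'user': 'user2', 'yob': 1989, 'saves': 7, 'hidden': 51},
    {'user': 'user3', 'yob': 1984, 'saves': 3, 'hidden': 2},
]

# Column-wise dictionary, equivalent to the loop above.
master_dict = {key: [d[key] for d in list_of_dicts] for key in list_of_dicts[0]}

# pd.DataFrame accepts either form: a dict of lists or a list of dicts.
df_from_master = pd.DataFrame(master_dict)
df_direct = pd.DataFrame(list_of_dicts)
```

Both frames have the columns user, yob, saves and hidden, so either one can be passed on to the binning step.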

Then use pandas to bin by 'yob':

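import numpy as np
import pandas as pd

df = pd.DataFrame(master_dict)
# 45 evenly spaced edges give 44 bins; pass 46 if you want exactly 45 bins
bins = np.linspace(df.yob.min(), df.yob.max(), 45)
# include_lowest=True keeps rows with the smallest yob in the first bin
cut = pd.cut(df.yob, bins, include_lowest=True)
# observed=False keeps empty bins, so the sums line up with bins[:-1]
group = df.groupby(cut, observed=False)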

The following lines give you the sum of each of the other dictionary items, binned by yob:

nsaves = group.saves.sum()
nhidden = group.hidden.sum()
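
To make the binned sum concrete, here is a small sketch using the numbers from the question (three records with yob 1998 and saves of 8, 19 and 4). The fourth record, the hidden values and the bin edges are made up for illustration.

```python
import pandas as pd

# Made-up data: three 1998 records from the question plus one extra record.
df = pd.DataFrame({
    'yob':    [1998, 1998, 1998, 2001],
    'saves':  [8, 19, 4, 5],
    'hidden': [1, 2, 3, 4],
})

# Hypothetical bin edges; the first bin (1990 to 1995) contains no records.
edges = [1990, 1995, 2000, 2005]
cut = pd.cut(df.yob, edges, include_lowest=True)

# observed=False keeps the empty bin, whose sum is 0.
group = df.groupby(cut, observed=False)
nsaves = group.saves.sum()
nhidden = group.hidden.sum()
print(nsaves)
```

The bin that contains 1998 holds a saves total of 31, which is the bar height the question asks for.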

You can then plot these sums against the bins defined above using plt.step or plt.bar:

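import matplotlib.pyplot as plt
# bins[:-1] are the left bin edges, so where='post' draws each total across its own bin
plt.step(bins[:-1], nsaves, color='r', where='post')
plt.step(bins[:-1], nhidden, color='b', where='post')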
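
Since the question asks for a stacked plot, here is a minimal sketch of the plt.bar route. It swaps the pd.cut binning for a plain groupby on yob (one bar per year), and it stacks the bars by passing the saves totals as the bottom of the hidden bars. The data are made up for illustration.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Made-up sample data.
df = pd.DataFrame({
    'yob':    [1984, 1984, 1989, 1998, 1998, 1998],
    'saves':  [24, 3, 7, 8, 19, 4],
    'hidden': [28, 2, 51, 1, 2, 3],
})

# One group per birth year, summing saves and hidden within each year.
totals = df.groupby('yob')[['saves', 'hidden']].sum()
years = totals.index.to_numpy()

fig, ax = plt.subplots()
# First layer: saves. Second layer: hidden, drawn on top of saves via bottom=.
saves_bars = ax.bar(years, totals['saves'], color='r', label='saves')
hidden_bars = ax.bar(years, totals['hidden'], bottom=totals['saves'],
                     color='b', label='hidden')
ax.set_xlabel('yob')
ax.set_ylabel('total')
ax.legend()
```

Call plt.show() afterwards to display the figure. If you prefer wider bins, the same bottom= idea should work with the pd.cut sums, using bins[:-1] as the x positions together with width=numpy.diff(bins) and align='edge'.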

