Python - How to count the frequency of each unique key from a column containing a dictionary of dictionaries?
I have a very large dataframe containing a column called 'time_columns'. Each cell of the column contains a dictionary of dictionaries, for example:
time_columns |
---|
{'Yesterday': {'text': 'Yesterday', 'type': 'DATE', 'value': '2022-04-15'}} |
{'Yesterday': {'text': 'Yesterday', 'type': 'DATE', 'value': '2022-04-16'}, 'Thursday': {'text': 'Thursday', 'type': 'DATE', 'value': '2022-04-14'}} |
How can I efficiently get a table containing the frequency count of the unique keys of the main dictionary, like the one below? (I want a table because I need to save the result to a CSV.)
text | count |
---|---|
Yesterday | 2 |
Thursday | 1 |
Try:
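df = (
    df["time_columns"]
    .explode()  # each dict is treated as list-like, so its keys become rows
    .value_counts()
    .rename_axis("text")  # name the index explicitly; works on pandas 1.x and 2.x
    .reset_index(name="count")
)
print(df)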
Prints:
        text  count
0  Yesterday      2
1   Thursday      1
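To try this approach end to end, here is a minimal, self-contained sketch that rebuilds the two sample rows from the question and renders the counts as CSV text. The names `counts` and `csv_text` are illustrative, and the `.map(list)` step is an addition made here only to spell out the dict-to-keys conversion; it is not part of the answer above.

```python
import pandas as pd

# Sample rows copied from the question
df = pd.DataFrame({
    "time_columns": [
        {"Yesterday": {"text": "Yesterday", "type": "DATE", "value": "2022-04-15"}},
        {
            "Yesterday": {"text": "Yesterday", "type": "DATE", "value": "2022-04-16"},
            "Thursday": {"text": "Thursday", "type": "DATE", "value": "2022-04-14"},
        },
    ]
})

counts = (
    df["time_columns"]
    .map(list)            # each dict becomes a list of its top-level keys
    .explode()            # one row per key
    .value_counts()       # frequency of each key, most common first
    .rename_axis("text")  # name the index so reset_index yields a 'text' column
    .reset_index(name="count")
)

# With no path argument, to_csv returns the CSV as a string
csv_text = counts.to_csv(index=False)
print(csv_text)
```

Passing a file path instead, for example `counts.to_csv("counts.csv", index=False)` (the file name is hypothetical), writes the same table to disk.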
Given the input data, could you try this?
import pandas as pd
# Stack each cell's inner dictionaries as rows, then count the 'text' values
tmp = pd.concat([pd.DataFrame.from_dict(v, orient="index") for v in df["time_columns"]])
tmp["text"].value_counts()
The simple way is to iterate over the column and accumulate the counts in a new dictionary, something like:
res = {}
for row in df['time_columns']:  # 'row' avoids shadowing the built-in dict
    for key in row:  # iterating a dict yields its top-level keys
        if key not in res:
            res[key] = 1
        else:
            res[key] += 1
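The same loop can be sketched with `collections.Counter` from the standard library, which handles the first occurrence of a key by itself. The `rows` list below is a stand-in (an assumption for illustration) for `df['time_columns']`, filled with the sample data from the question.

```python
from collections import Counter

# Stand-in for df['time_columns']: one dictionary of dictionaries per row
rows = [
    {"Yesterday": {"text": "Yesterday", "type": "DATE", "value": "2022-04-15"}},
    {
        "Yesterday": {"text": "Yesterday", "type": "DATE", "value": "2022-04-16"},
        "Thursday": {"text": "Thursday", "type": "DATE", "value": "2022-04-14"},
    },
]

# Iterating a dict yields its top-level keys, so each key is counted once per row
res = Counter(key for row in rows for key in row)

print(res.most_common())  # [('Yesterday', 2), ('Thursday', 1)]
```

With a real dataframe, the generator would iterate over `df['time_columns']` instead of `rows`.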
If you know the keys in advance, you can initialize the dictionary with those keys set to zero and replace the if statement inside the loop with a plain increment.
keys = ['Yesterday', 'Thursday', 'etc.']
res = {key: 0 for key in keys}  # every known key starts at zero
for row in df['time_columns']:  # 'row' avoids shadowing the built-in dict
    for key in row:
        res[key] += 1
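One caveat with pre-initialized keys: a key that is missing from `keys` makes `res[key] += 1` raise `KeyError`. If that can happen, `collections.defaultdict(int)` keeps the plain increment without listing the keys up front. A minimal sketch, where `rows` is a hypothetical stand-in for `df['time_columns']` and the inner dictionaries are shortened for brevity:

```python
from collections import defaultdict

# Hypothetical stand-in for df['time_columns'], inner dictionaries shortened
rows = [
    {"Yesterday": {"text": "Yesterday"}},
    {"Yesterday": {"text": "Yesterday"}, "Thursday": {"text": "Thursday"}},
]

res = defaultdict(int)  # a missing key is created with the value 0 on first access
for row in rows:
    for key in row:
        res[key] += 1

print(dict(res))  # {'Yesterday': 2, 'Thursday': 1}
```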