Create a Pandas DataFrame from a nested dict
I have a nested dict with the following structure: a course_id mapped to a nested dict holding 2 recommended courses and the number of purchases for each of them. For example, entries of this dict look something like this:
{490: {566: 253, 551: 247},
357: {571: 112, 356: 100},
507: {570: 172, 752: 150}}
I tried this code to make a dataframe from this dict:
result=pd.DataFrame.from_dict(dicts, orient='index').stack().reset_index()
result.columns=['Course ID','Recommended course','Number of purchases']
This doesn't quite work for me, because I want an output with 5 columns: Course ID, recommended course 1, purchases 1, recommended course 2, purchases 2. Is there any solution for this? Thanks in advance.
I would recommend you just re-shape your dictionary and then re-create your dataframe; however, you're not far off from getting your target output from your current dataframe.
We can groupby and use cumcount to create our unique column, then unstack and assign our columns from the multi-index header that was created.
s1 = result.groupby(['Course ID',
result.groupby(['Course ID']).cumcount() + 1]).first().unstack()
s1.columns = [f"{x}_{y}" for x,y in s1.columns]
Recommended course_1 Recommended course_2 Number of purchases_1 \
Course ID
357 571 356 112.0
490 566 551 253.0
507 570 752 172.0
Number of purchases_2
Course ID
357 100.0
490 247.0
507 150.0
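For readers who want to run the groupby / cumcount / unstack idea end to end, here is a minimal, self-contained sketch. It is not the exact code above: it builds the long table directly from the dict with a list comprehension instead of `stack()` (an assumption made here so the values stay integers and the result does not depend on how `stack()` treats missing values), and the helper names `rows`, `rank` and `wide` are illustrative only.

```python
import pandas as pd

dicts = {
    490: {566: 253, 551: 247},
    357: {571: 112, 356: 100},
    507: {570: 172, 752: 150},
}

# Long format: one row per (course, recommended course) pair.
rows = [
    (course_id, rec_id, purchases)
    for course_id, recs in dicts.items()
    for rec_id, purchases in recs.items()
]
result = pd.DataFrame(
    rows, columns=["Course ID", "Recommended course", "Number of purchases"]
)

# Number the recommendations within each course: 1, 2, ...
result["rank"] = result.groupby("Course ID").cumcount() + 1

# Pivot to wide format; the columns become a (field, rank) MultiIndex.
wide = result.set_index(["Course ID", "rank"]).unstack("rank")

# Flatten the MultiIndex header into names such as "Recommended course_1".
wide.columns = [f"{field}_{rank}" for field, rank in wide.columns]
wide = wide.reset_index()

print(wide)
```

Because `cumcount` numbers rows in their existing order, "recommendation 1" is simply whichever entry comes first in each nested dict.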
Not an efficient one, but it should work in your case:
df = pd.DataFrame([(k,list(v.keys())[0],list(v.values())[0],list(v.keys())[1],list(v.values())[1]) for k,v in a.items()], columns = ['Course ID','Recommended course 1','purchases 1', 'Recommended Course 2', 'purchases 2'])
print(df)
Output:
Course ID Recommended course 1 purchases 1 Recommended Course 2 \
0 490 566 253 551
1 357 571 112 356
2 507 570 172 752
purchases 2
0 247
1 100
2 150
You can use itertools chain to convert each nested dict into a flat list of key, value pairs and store the result in a dictionary d2, built with a dictionary comprehension whose keys are the course ids, and then form the dataframe with pandas.
import pandas as pd
from itertools import chain
d = {
490: {566: 253, 551: 247},
357: {571: 112, 356: 100},
507: {570: 172, 752: 150}
}
d2 = {k: list(chain.from_iterable(v.items())) for k, v in d.items()}
df = pd.DataFrame.from_dict(d2, orient='index').reset_index()
df.columns = ['id','rec_course1', 'n_purch_1', 'rec_course2', 'n_purch_2']
df
id rec_course1 n_purch_1 rec_course2 n_purch_2
0 490 566 253 551 247
1 357 571 112 356 100
2 507 570 172 752 150
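If the number of recommendations per course might change, the column names can be derived from the width of the frame rather than typed by hand. The following is only a sketch under the assumption that every course has the same number of recommendations; the names `flat` and `n_recs` are made up for this example.

```python
import pandas as pd
from itertools import chain

d = {
    490: {566: 253, 551: 247},
    357: {571: 112, 356: 100},
    507: {570: 172, 752: 150},
}

# Flatten each nested dict into [rec, purchases, rec, purchases, ...].
flat = {k: list(chain.from_iterable(v.items())) for k, v in d.items()}
df = pd.DataFrame.from_dict(flat, orient="index")

# Each recommendation contributes two columns: the course and its purchases.
n_recs = df.shape[1] // 2
df.columns = list(
    chain.from_iterable(
        (f"rec_course{i}", f"n_purch_{i}") for i in range(1, n_recs + 1)
    )
)

# Turn the course id index into a regular column named "id".
df = df.rename_axis("id").reset_index()

print(df)
```

With the sample data this yields the same five column names as the hand-written list above.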