I have a nested dict with the following structure: each key is a course_id, and each value is a nested dict holding 2 recommended courses and the number of purchases for each of them. For example, the entries of this dict look something like this:
{490: {566: 253, 551: 247},
357: {571: 112, 356: 100},
507: {570: 172, 752: 150}}
I tried this code to make a dataframe from this dict:
result=pd.DataFrame.from_dict(dicts, orient='index').stack().reset_index()
result.columns=['Course ID','Recommended course','Number of purchases']
This doesn't quite work for me, because I want an output with 5 columns: Course ID, recommended course 1, purchases 1, recommended course 2, purchases 2. Is there any solution for this? Thanks in advance.
I would recommend you just re-shape your dictionary and then re-create your dataframe; however, you're not far off from getting your target output from your current dataframe.
We can groupby and use cumcount to create a unique counter column, then unstack and assign our column names from the multi-index header that was created.
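# Number the recommendations within each course (1, 2, ...) and use that as a second group key
s1 = result.groupby(['Course ID',
                     result.groupby(['Course ID']).cumcount() + 1]).first().unstack()
# Flatten the MultiIndex columns, e.g. ('Recommended course', 1) -> 'Recommended course_1'
s1.columns = [f"{x}_{y}" for x, y in s1.columns]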
           Recommended course_1  Recommended course_2  Number of purchases_1  Number of purchases_2
Course ID
357                         571                   356                  112.0                  100.0
490                         566                   551                  253.0                  247.0
507                         570                   752                  172.0                  150.0
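The re-shaping route mentioned at the start of this answer could look roughly like the sketch below. It assumes the nested dict is named dicts, as in the question; the names rows and wide and the exact column labels are illustrative choices, not something the question prescribes.

```python
import pandas as pd

# Sample data copied from the question; `dicts` is the name used there.
dicts = {
    490: {566: 253, 551: 247},
    357: {571: 112, 356: 100},
    507: {570: 172, 752: 150},
}

rows = []
for course_id, recs in dicts.items():
    row = {"Course ID": course_id}
    # Number the recommendations from 1 so the column suffixes
    # read "course 1", "course 2", ...
    for i, (rec_id, n_purchases) in enumerate(recs.items(), start=1):
        row[f"Recommended course {i}"] = rec_id
        row[f"Purchases {i}"] = n_purchases
    rows.append(row)

# One dict per course becomes one row of the wide dataframe.
wide = pd.DataFrame(rows)
print(wide)
```

As long as every course has the same number of recommendations, no missing values are introduced along the way, so the purchase counts should stay integers rather than becoming floats.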
Not the most efficient approach, but it should work in your case:
# dicts is the nested dict from the question
df = pd.DataFrame(
    [(k, list(v.keys())[0], list(v.values())[0], list(v.keys())[1], list(v.values())[1])
     for k, v in dicts.items()],
    columns=['Course ID', 'Recommended course 1', 'purchases 1', 'Recommended Course 2', 'purchases 2'])
print(df)
Output:
   Course ID  Recommended course 1  purchases 1  Recommended Course 2  purchases 2
0        490                   566          253                   551          247
1        357                   571          112                   356          100
2        507                   570          172                   752          150
You can use itertools.chain to flatten each nested dict into a flat list of alternating keys and values, and store the results in a dictionary d2 built with a dictionary comprehension, where the keys are the course ids. Then form the dataframe from d2 using pandas.
import pandas as pd
from itertools import chain
d = {
490: {566: 253, 551: 247},
357: {571: 112, 356: 100},
507: {570: 172, 752: 150}
}
d2 = {k: list(chain.from_iterable(v.items())) for k, v in d.items()}
df = pd.DataFrame.from_dict(d2, orient='index').reset_index()
df.columns = ['id','rec_course1', 'n_purch_1', 'rec_course2', 'n_purch_2']
df
id rec_course1 n_purch_1 rec_course2 n_purch_2
0 490 566 253 551 247
1 357 571 112 356 100
2 507 570 172 752 150
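To see why the flattening step puts the values in the right column order, here is a minimal sketch of what chain.from_iterable does to a single inner dict. The helper name flatten_pairs is hypothetical and only used for illustration; the ordering relies on dicts preserving insertion order (Python 3.7+).

```python
from itertools import chain


def flatten_pairs(inner):
    """Flatten {key: value, ...} into [key, value, key, value, ...]."""
    # inner.items() yields (key, value) tuples; chain.from_iterable
    # concatenates those tuples into one flat sequence.
    return list(chain.from_iterable(inner.items()))


flat = flatten_pairs({566: 253, 551: 247})
print(flat)  # [566, 253, 551, 247]
```

Each flattened list then becomes one row, which is why the columns come out as recommended course, purchases, recommended course, purchases.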