Create a Pandas Dataframe from nested dict

I have a nested dict with the following structure: each key is a course_id, and each value is a nested dict holding 2 recommended courses and the number of purchases for each of them. For example, entries of this dict look something like this:

 {490: {566: 253, 551: 247},
 357: {571: 112, 356: 100},
 507: {570: 172, 752: 150}}

I tried this code to make a dataframe from this dict:

result=pd.DataFrame.from_dict(dicts, orient='index').stack().reset_index()
result.columns=['Course ID','Recommended course','Number of purchases']

This gives three columns, with one row per (course, recommended course) pair.

This doesn't quite work for me, because I want an output with 5 columns: Course ID, recommended course 1, purchases 1, recommended course 2, purchases 2. Is there any solution for this? Thanks in advance.

I would recommend that you just reshape your dictionary and then re-create your dataframe; however, you're not far off from getting your target output from your current dataframe.

We can group by Course ID and use cumcount to number the rows within each course, then unstack on that counter and flatten the resulting MultiIndex column header into plain column names.

s1 = result.groupby(['Course ID',
             result.groupby(['Course ID']).cumcount() + 1]).first().unstack()

s1.columns = [f"{x}_{y}" for x,y in s1.columns]


           Recommended course_1  Recommended course_2  Number of purchases_1  Number of purchases_2
Course ID
357                         571                   356                  112.0                  100.0
490                         566                   551                  253.0                  247.0
507                         570                   752                  172.0                  150.0
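For anyone who wants to run the groupby/cumcount approach end to end, here is a minimal self-contained sketch. The names `counter` and `wide` are only illustrative, and the `.dropna()` and `.astype(int)` calls are my additions: the first guards against pandas versions where `stack()` keeps missing cells, the second turns the purchase counts back into integers.

```python
import pandas as pd

dicts = {
    490: {566: 253, 551: 247},
    357: {571: 112, 356: 100},
    507: {570: 172, 752: 150},
}

# Long format: one row per (course, recommended course) pair.
# dropna() is a no-op if stack() has already dropped the missing cells.
result = (
    pd.DataFrame.from_dict(dicts, orient="index")
    .stack()
    .dropna()
    .reset_index()
)
result.columns = ["Course ID", "Recommended course", "Number of purchases"]

# cumcount numbers the rows inside each course: 1 for the first
# recommendation, 2 for the second.
counter = result.groupby("Course ID").cumcount() + 1

# Group by course and counter, then move the counter into the columns.
wide = result.groupby(["Course ID", counter]).first().unstack()

# Flatten the two-level header, e.g. ("Recommended course", 1)
# becomes "Recommended course_1".
wide.columns = [f"{name}_{n}" for name, n in wide.columns]

# The purchase counts became floats while NaNs were present; cast back.
wide = wide.astype(int)
print(wide)
```

Printing `counter` on its own is a quick way to see what the second grouping key looks like before the unstack.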

Not an efficient one, but it should work in your case (here a is the nested dict from the question):

df = pd.DataFrame([(k,list(v.keys())[0],list(v.values())[0],list(v.keys())[1],list(v.values())[1]) for k,v in a.items()], columns = ['Course ID','Recommended course 1','purchases 1', 'Recommended Course 2', 'purchases 2'])
print(df)

Output:

   Course ID  Recommended course 1  purchases 1  Recommended Course 2  purchases 2
0        490                   566          253                   551          247
1        357                   571          112                   356          100
2        507                   570          172                   752          150
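The one-liner above rebuilds the key and value lists several times per row. A possible variant, sketched here under the assumption that every inner dict holds exactly two entries, unpacks the (course, purchases) pairs once per row instead. The names `rows`, `recs`, `rec1`, `n1`, `rec2` and `n2` are only illustrative.

```python
import pandas as pd

a = {
    490: {566: 253, 551: 247},
    357: {571: 112, 356: 100},
    507: {570: 172, 752: 150},
}

rows = []
for course_id, recs in a.items():
    # Unpack the two (recommended course, purchases) pairs in insertion order.
    # This raises ValueError if an inner dict does not have exactly two entries.
    (rec1, n1), (rec2, n2) = recs.items()
    rows.append((course_id, rec1, n1, rec2, n2))

df = pd.DataFrame(
    rows,
    columns=[
        "Course ID",
        "Recommended course 1",
        "purchases 1",
        "Recommended Course 2",
        "purchases 2",
    ],
)
print(df)
```

The result should match the frame shown above, and a malformed entry fails loudly rather than being silently truncated.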

You can use itertools.chain to flatten each inner dict into a single list of alternating keys and values, store these lists in a dictionary d2 (built with a dict comprehension) keyed by course id, and then form the dataframe with pandas.

import pandas as pd
from itertools import chain

d = {
    490: {566: 253, 551: 247},
    357: {571: 112, 356: 100},
    507: {570: 172, 752: 150}
}

d2 = {k: list(chain.from_iterable(v.items())) for k, v in d.items()}
df = pd.DataFrame.from_dict(d2, orient='index').reset_index()
df.columns = ['id','rec_course1', 'n_purch_1', 'rec_course2', 'n_purch_2']

df

    id   rec_course1  n_purch_1  rec_course2  n_purch_2
0  490           566        253          551        247
1  357           571        112          356        100
2  507           570        172          752        150
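If you would rather not type the column names by hand, they can be generated from the width of the flattened frame. This is only a sketch on the same sample data: the helper names `flat`, `n_pairs` and `names` are mine, and the naming pattern simply copies the columns used above.

```python
import pandas as pd
from itertools import chain

d = {
    490: {566: 253, 551: 247},
    357: {571: 112, 356: 100},
    507: {570: 172, 752: 150},
}

# Flatten each inner dict into [course, purchases, course, purchases, ...].
flat = {k: list(chain.from_iterable(v.items())) for k, v in d.items()}
df = pd.DataFrame.from_dict(flat, orient="index")

# Every recommendation takes two columns, so half the width is the pair count.
n_pairs = df.shape[1] // 2
names = chain.from_iterable(
    (f"rec_course{i}", f"n_purch_{i}") for i in range(1, n_pairs + 1)
)
df.columns = list(names)

# Turn the course id index into a regular column called "id".
df = df.rename_axis("id").reset_index()
print(df)
```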
