
Convert a nested dictionary into a pandas DataFrame

Example dictionary:

sample_dict = {'doctor': {'docter_a': 26, 'docter_b': 40, 'docter_c': 42}, 
               'teacher': {'teacher_x': 21, 'teacher_y': 45, 'teacher_z': 33}}

Desired output DataFrame:

job     | person    | age
doctor  | docter_a  | 26
doctor  | docter_b  | 40
doctor  | docter_c  | 42
teacher | teacher_x | 21
teacher | teacher_y | 45
teacher | teacher_z | 33

I have tried:

import pandas as pd
df = pd.DataFrame.from_dict(sample_dict)

which gives one column per job instead of one row per person:

             doctor      teacher
docter_a  |  26      |   NaN
docter_b  |  40      |   NaN
docter_c  |  42      |   NaN
teacher_x |  NaN     |   21
teacher_y |  NaN     |   45
teacher_z |  NaN     |   33

Could someone help me figure this out?

Use a nested list comprehension:

pd.DataFrame([[k1, k2, v]
              for k1,d in sample_dict.items() 
              for k2,v in d.items()],
             columns=['job', 'person', 'age'])

Output:

       job     person  age
0   doctor   docter_a   26
1   doctor   docter_b   40
2   doctor   docter_c   42
3  teacher  teacher_x   21
4  teacher  teacher_y   45
5  teacher  teacher_z   33
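
For reference, here is a self-contained sketch of the same comprehension that can be pasted and run as is. The helper name nested_dict_to_frame and its columns parameter are made up for this illustration and are not part of pandas.

```python
import pandas as pd

sample_dict = {'doctor': {'docter_a': 26, 'docter_b': 40, 'docter_c': 42},
               'teacher': {'teacher_x': 21, 'teacher_y': 45, 'teacher_z': 33}}


def nested_dict_to_frame(nested, columns=('job', 'person', 'age')):
    """Hypothetical helper: one row per (outer key, inner key, value)."""
    rows = [(outer, inner, value)
            for outer, inner_dict in nested.items()
            for inner, value in inner_dict.items()]
    return pd.DataFrame(rows, columns=list(columns))


df = nested_dict_to_frame(sample_dict)
print(df)
```

Because the rows are built one inner entry at a time, the inner dictionaries do not need to have the same number of entries.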

You can also zip each job name with its people and ages into 3-element tuples, then reshape the result into three columns and pass it to pd.DataFrame:

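import numpy as np
import pandas as pd

# One list of (job, person, age) tuples per job; inner dicts must be equally long
zip_list = [list(zip([key] * len(inner), inner, inner.values()))
            for key, inner in sample_dict.items()]

# reshape(-1, 3) yields one row per person (col_len**2 rows would not fit the data).
# np.ravel casts every value to str, so the age column is converted back to int.
output = pd.DataFrame(np.ravel(zip_list).reshape(-1, 3),
                      columns=['job', 'person', 'age']).astype({'age': int})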
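
One caveat with the reshaping route is that a NumPy array holding both strings and integers stores everything as strings. A possible variant, sketched here as an alternative rather than as the answer's own method, keeps the zip idea but joins the groups with itertools.chain.from_iterable, so the ages stay integers and the inner dictionaries may differ in length. The names groups and people are only illustrative.

```python
from itertools import chain

import pandas as pd

sample_dict = {'doctor': {'docter_a': 26, 'docter_b': 40, 'docter_c': 42},
               'teacher': {'teacher_x': 21, 'teacher_y': 45, 'teacher_z': 33}}

# One lazy group of (job, person, age) tuples per job.
groups = (zip([job] * len(people), people.keys(), people.values())
          for job, people in sample_dict.items())

# chain.from_iterable concatenates the groups without touching the value types.
output = pd.DataFrame(list(chain.from_iterable(groups)),
                      columns=['job', 'person', 'age'])
print(output.dtypes)
```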

