
How to convert a defaultdict(list) to Pandas DataFrame

I have a defaultdict(list) object that is of this structure:

{id: [list[list]]}

for example,

'a1': [[0.01, 'cat']],
'a2': [[0.09, 'cat']],
'a3': [[0.5, 'dog']],
...

I'd like to convert this defaultdict(list) into a Pandas DataFrame object.

I tried with the following:

df = pd.DataFrame(list(my_dict.items()), columns=['id', 'category'])

However, I ran into a problem with my 'category' column: each value in it is a list of lists. I'm trying to split the 2 values in 'category' into 2 separate columns, so that my final DataFrame columns would be ['id', 'score', 'category'].

When I tried the apply call below:

df['category'].apply(lambda x: x[0][0])

I got the error 'list index out of range'.

What could be wrong with my code? How shall I create the 2 new columns from a list of lists?

Thank you.

I believe you need:

df = pd.DataFrame([[k] + v[0] for k, v in my_dict.items()], 
                   columns=['id', 'score', 'category'])

Or:

df = pd.DataFrame([(k, v[0][0], v[0][1]) for k, v in my_dict.items()], 
                   columns=['id', 'score', 'category'])
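
Neither snippet above explains the 'list index out of range' error itself. One possible cause, which is only an assumption because the question does not show the full data, is that some ids map to an empty list: merely reading a missing key from a defaultdict(list) stores one. The sketch below uses made-up data (the 'a4' key is hypothetical) and skips such entries before indexing:

```python
from collections import defaultdict

import pandas as pd

# Hypothetical data shaped like the dict in the question.
my_dict = defaultdict(list)
my_dict['a1'].append([0.01, 'cat'])
my_dict['a2'].append([0.09, 'cat'])
my_dict['a4']  # reading a missing key silently stores an empty list

# v[0] would raise IndexError for 'a4', so keep only non-empty lists.
rows = [[k] + v[0] for k, v in my_dict.items() if v]
df = pd.DataFrame(rows, columns=['id', 'score', 'category'])
print(df)
```

If the empty ids should appear in the result as well, they would need placeholder values instead of being filtered out.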

You can also use a nested list comprehension, for example:

import pandas as pd

d = {'a1': [[0.01, 'cat']], 'a2': [[0.09, 'cat']], 'a3': [[0.5, 'dog']]}

df = pd.DataFrame([[k] + j for k, v in d.items() for j in v],
                  columns=['id', 'score', 'category'])
print(df)

Output (on Python 3.7+, where dicts keep insertion order):

   id  score category
0  a1   0.01      cat
1  a2   0.09      cat
2  a3   0.50      dog
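
The nested comprehension differs from indexing v[0] when an id holds more than one [score, category] pair: it emits one row per inner list rather than only the first. A minimal sketch, assuming hypothetical data in which 'a1' has two pairs:

```python
import pandas as pd

# Hypothetical data: 'a1' holds two [score, category] pairs.
d = {'a1': [[0.01, 'cat'], [0.7, 'dog']], 'a2': [[0.09, 'cat']]}

# One row per inner list, so 'a1' contributes two rows.
all_rows = pd.DataFrame([[k] + j for k, v in d.items() for j in v],
                        columns=['id', 'score', 'category'])

# Indexing v[0] keeps only the first pair for each id.
first_only = pd.DataFrame([[k] + v[0] for k, v in d.items()],
                          columns=['id', 'score', 'category'])

print(len(all_rows), len(first_only))
```

Which form is appropriate depends on whether every id is guaranteed to hold exactly one pair, which the question does not state.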
