
Create DataFrame from list of dicts in Pandas series

I have a pandas series with string data structured like this for each "row":

["[{'id': 240, 'name': 'travolta'}, {'id': 378, 'name': 'suleimani'}, {'id': 730, 'name': 'pearson'}, {'id': 1563, 'name': 'googenhaim'}, {'id': 1787, 'name': 'al_munir'}, {'id': 10183, 'name': 'googenhaim'}, {'id': 13072, 'name': 'vodkin'}]"]

When I use the standard approaches to build a DataFrame, I get:

> 0 [{'id': 240, 'name': 'travolta'}, {'id': 378, ...   
> 1 [{'id': 240, 'name': 'suleimani'}, {'id': 378,...

How can I build a proper DataFrame whose columns are named after the dict keys?

You can use the json module to load that structure:

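import json
import pandas as pd

data = ["[{'id': 240, 'name': 'travolta'}, {'id': 378, 'name': 'suleimani'}, {'id': 730, 'name': 'pearson'}, {'id': 1563, 'name': 'googenhaim'}, {'id': 1787, 'name': 'al_munir'}, {'id': 10183, 'name': 'googenhaim'}, {'id': 13072, 'name': 'vodkin'}]"]

# Swap single quotes for double quotes so the string becomes valid JSON.
# Note: this breaks if any value itself contains an apostrophe or a double quote.
data = ''.join(data).replace('\'', '"')
data = json.loads(data)
df = pd.DataFrame(data)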

# Resulting df:
#       id        name
# 0    240    travolta
# 1    378   suleimani
# 2    730     pearson
# 3   1563  googenhaim
# 4   1787    al_munir
# 5  10183  googenhaim
# 6  13072      vodkin

import pandas
import ast
spam = ["[{'id': 240, 'name': 'travolta'}, {'id': 378, 'name': 'suleimani'}, {'id': 730, 'name': 'pearson'}, {'id': 1563, 'name': 'googenhaim'}, {'id': 1787, 'name': 'al_munir'}, {'id': 10183, 'name': 'googenhaim'}, {'id': 13072, 'name': 'vodkin'}]"]
eggs = ast.literal_eval(spam[0])

df = pandas.DataFrame(eggs)
print(df)

Output

      id        name
0    240    travolta
1    378   suleimani
2    730     pearson
3   1563  googenhaim
4   1787    al_munir
5  10183  googenhaim
6  13072      vodkin
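
One reason to prefer ast.literal_eval over the quote-replacement trick with json: the replacement rewrites every single quote, including apostrophes inside values. The following is only a small sketch of that failure mode; the row and the name "o'brien" are made up for illustration and do not come from the question's data.

```python
import ast
import json

# Hypothetical row: the name contains an apostrophe, so Python's repr
# wraps that value in double quotes instead of single quotes.
raw = "[{'id': 1, 'name': \"o'brien\"}]"

# Blind quote replacement turns the apostrophe into a stray double quote,
# which leaves the string as invalid JSON.
try:
    json.loads(raw.replace("'", '"'))
    json_ok = True
except json.JSONDecodeError:
    json_ok = False

# literal_eval parses Python literal syntax directly, so no rewriting is needed.
parsed = ast.literal_eval(raw)

print(json_ok)  # False
print(parsed)   # [{'id': 1, 'name': "o'brien"}]
```

If the strings are known to contain only simple values without quotes, both approaches should give the same result.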

As mentioned in my comment, you don't have a list of dicts but a single-element list whose only element is a string literal representing a list of dicts.
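
To make that distinction concrete, a quick inspection along these lines shows the types involved. The data is shortened to two entries here purely for brevity.

```python
import ast

# Shortened version of the question's data: one string inside a list.
lst = ["[{'id': 240, 'name': 'travolta'}, {'id': 378, 'name': 'suleimani'}]"]

print(type(lst).__name__, len(lst))  # list 1
print(type(lst[0]).__name__)         # str

# Only after parsing does the element become a real list of dicts.
parsed = ast.literal_eval(lst[0])
print(type(parsed).__name__, len(parsed))  # list 2
print(type(parsed[0]).__name__)            # dict
```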

For the input in your example, you could use ast.literal_eval, followed by flattening the main list, lst, as follows:

import pandas as pd
import ast

lst = ["[{'id': 240, 'name': 'travolta'}, {'id': 378, 'name': 'suleimani'}, {'id': 730, 'name': 'pearson'}, {'id': 1563, 'name': 'googenhaim'}, {'id': 1787, 'name': 'al_munir'}, {'id': 10183, 'name': 'googenhaim'}, {'id': 13072, 'name': 'vodkin'}]"]

# Parse each string into a list of dicts and flatten everything into one list.
rows = [d for e in lst for d in ast.literal_eval(e)]

frame = pd.DataFrame(rows)
print(frame)

Output

      id        name
0    240    travolta
1    378   suleimani
2    730     pearson
3   1563  googenhaim
4   1787    al_munir
5  10183  googenhaim
6  13072      vodkin
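
Since the question mentions a pandas Series rather than a plain list, the same idea can probably be applied directly to the Series. This is a minimal sketch, assuming every row holds a well-formed, non-empty stringified list of dicts; the two-row Series and the index name source_row are made up for illustration.

```python
import ast
import pandas as pd

# Hypothetical two-row Series of stringified lists of dicts.
s = pd.Series([
    "[{'id': 240, 'name': 'travolta'}, {'id': 378, 'name': 'suleimani'}]",
    "[{'id': 730, 'name': 'pearson'}]",
])

# Parse each string into a real list of dicts, then give each dict its own row.
exploded = s.apply(ast.literal_eval).explode()

# Build the frame from the dicts, keeping the original row label as the index.
frame = pd.DataFrame(exploded.tolist(), index=exploded.index)
frame.index.name = "source_row"
print(frame)
```

Keeping the original row label in the index makes it possible to tell which Series row each dict came from. Rows whose list is empty would come out of explode as NaN and would need to be dropped before building the frame.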
