简体   繁体   中英

separate dict from list in pandas dataframe column into different dataframe columns

[{
"name":"game_time",
"type":"int",
"info":"millisecond count since start of game"
},
{
"name":"round",
"type":"int",
"info":"number of the current round when the even takes place or 0 if no round"
}]

my try:

specs: Dataframe containing args column the file sample is attached below

specs['args'].apply(lambda x : x.split('},{')).to_frame()['args'].apply(pd.Series).apply(lambda x : x.str[2:])
specs['args'].apply(pd.Series)

sample file

I hope that the ast will help you in this case.Here is solution

one version of result

import pandas as pd
from ast import literal_eval

df = pd.read_csv('test_.csv', header = None)
df

Out[1]:

           0
    0   [{"name":"game_time","type":"int","info":"mill...
    1   [{"name":"game_time","type":"int","info":"mill...
    2   [{"name":"game_time","type":"int","info":"mill...
    3   [{"name":"game_time","type":"int","info":"mill..

lst = [m for s in df[0] for m in literal_eval(s)]
lst

Out[2]:

[{'name': 'game_time',
  'type': 'int',
  'info': 'millisecond count since start of game'},
 {'name': 'round',
  'type': 'int',
  'info': 'number of the current round when the event takes place or 0  if no round'},
 {'name': 'level',
  'type': 'int',
  'info': 'number of the current level when the event takes place or 0     if no level'},
 {'name': 'description',.......


pd.DataFrame.from_dict(lst)

Out[3]:

                                                     info   name        type
    0   millisecond count since start of game               game_time   int
    1   number of the current round when the event tak...   round       int
    2   number of the current level when the event tak...   level       int
    3   the text or description of the instruction          description string
    ........

is it your desired result?

another version of result

if you desire the same output like in your code, here is the example

lst1 = [literal_eval(s) for s in df[0]]
pd.DataFrame(lst1)

Just use the dataframe constructor

data = [{
"name":"game_time",
"type":"int",
"info":"millisecond count since start of game"
},
{
"name":"round",
"type":"int",
"info":"number of the current round when the even takes place or 0 if no round"
}]

print(pd.DataFrame(data))

out:

                                                info       name type
0              millisecond count since start of game  game_time  int
1  number of the current round when the even take...      round  int

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM