The source of the DataFrame is a list of dicts like this:
ls = [{'fileName': 'file_01', 'col1': {'key1': 'value1a', 'key2': 'value1b'}}, {'fileName': 'file_02', 'col1': {'key1': 'value2a', 'key2': 'value2b', 'key3':'value2c'}}, {'fileName': 'file_03', 'col1': {'key1': 'value3a', 'key3': 'value3c'}}]
The DataFrame is created as:
df = pd.DataFrame(ls, columns=['fileName', 'col1'])
The pandas DataFrame df looks like this:
fileName col1
file_01 {'key1':value1a, 'key2':value1b}
file_02 {'key1':value2a, 'key2':value2b, 'key3':value2c}
file_03 {'key1':value3a, 'key3':value3c}
How can I convert this to look like -
fileName  key1     key2     key3
file_01   value1a  value1b
file_02   value2a  value2b  value2c
file_03   value3a           value3c
I tried -
df = pd.concat([df['fileName'], pd.get_dummies(df['col1'].apply(pd.Series))], axis=1)
In some cases I get results like this:
fileName key1_value1a key1_value2a key1_value3a
file_01 value1a
file_02 value2a
file_03 value3a
Use pd.json_normalize():
In [40]: pd.concat([df['fileName'], pd.json_normalize(df['col1'])],axis=1)
Out[40]:
fileName key1 key2 key3
0 file_01 value1a value1b NaN
1 file_02 value2a value2b value2c
2 file_03 value3a NaN value3c
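If the original list ls is still available, a possible shortcut is to normalize it directly instead of going through df. This is only a sketch: the name flat is made up for illustration, and it assumes the nested column is called col1 as in the question, so the flattened columns carry a 'col1.' prefix that is stripped afterwards.

```python
import pandas as pd

ls = [
    {'fileName': 'file_01', 'col1': {'key1': 'value1a', 'key2': 'value1b'}},
    {'fileName': 'file_02', 'col1': {'key1': 'value2a', 'key2': 'value2b', 'key3': 'value2c'}},
    {'fileName': 'file_03', 'col1': {'key1': 'value3a', 'key3': 'value3c'}},
]

# json_normalize flattens nested dicts into dotted column names:
# fileName, col1.key1, col1.key2, col1.key3
flat = pd.json_normalize(ls)

# Drop the 'col1.' prefix (plain string replacement, not a regex)
flat.columns = flat.columns.str.replace('col1.', '', regex=False)

print(flat)
```

Keys that are missing from a row show up as NaN; append .fillna('') if blank cells are preferred.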
Can you try the following:
df1 = pd.concat([df[['fileName']], pd.DataFrame(df['col1'].to_list())], axis=1)
The above will work if the values in col1 ({'key1':value1a, 'key2':value1b}, {'key1':value2a, 'key2':value2b, 'key3':value2c}, ...) are of type dict.
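One caveat with this approach: pd.DataFrame(list) gets a fresh 0..n-1 index, so pd.concat only lines the rows up if df also has the default index. A minimal sketch of a way around that, assuming df may have been filtered first (subset and expanded are illustrative names, not from the question):

```python
import pandas as pd

ls = [
    {'fileName': 'file_01', 'col1': {'key1': 'value1a', 'key2': 'value1b'}},
    {'fileName': 'file_02', 'col1': {'key1': 'value2a', 'key2': 'value2b', 'key3': 'value2c'}},
    {'fileName': 'file_03', 'col1': {'key1': 'value3a', 'key3': 'value3c'}},
]
df = pd.DataFrame(ls, columns=['fileName', 'col1'])

# After filtering, the index is [0, 2] rather than [0, 1]
subset = df[df['fileName'] != 'file_02']

# Reuse the original index so concat aligns the rows correctly
expanded = pd.DataFrame(subset['col1'].to_list(), index=subset.index)
df1 = pd.concat([subset[['fileName']], expanded], axis=1)

print(df1)
```

Without index=subset.index, the expanded frame would be indexed 0 and 1, and concat would produce misaligned rows padded with NaN.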
This solution will also work, but the solution provided by @Vorsprung looks nice.
You can try the following solution:
df1 = pd.concat([df['fileName'], df['col1'].apply(pd.Series)], axis=1)
df['col1'].apply(pd.Series) splits each dict into separate columns.
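For reference, here is a self-contained sketch of the apply(pd.Series) approach. It uses drop plus join instead of concat, which is just one possible variation, and fills the missing keys with empty strings to match the layout asked for in the question.

```python
import pandas as pd

ls = [
    {'fileName': 'file_01', 'col1': {'key1': 'value1a', 'key2': 'value1b'}},
    {'fileName': 'file_02', 'col1': {'key1': 'value2a', 'key2': 'value2b', 'key3': 'value2c'}},
    {'fileName': 'file_03', 'col1': {'key1': 'value3a', 'key3': 'value3c'}},
]
df = pd.DataFrame(ls, columns=['fileName', 'col1'])

# Each dict becomes a Series, so its keys turn into columns;
# the result keeps df's index, which lets join align the rows
expanded = df['col1'].apply(pd.Series)

# Replace the dict column with the expanded columns and blank out missing keys
df1 = df.drop(columns='col1').join(expanded).fillna('')

print(df1)
```

Note that apply(pd.Series) builds one Series per row, so it tends to be noticeably slower than the other two approaches on large frames.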