My dataframe has a nested column (people_info) that contains cells like the sample below.
[{"institution":"some_institution","startMonth":1,"startYear":2563,"course":"any","id":1111,"formation":"any","endMonth":12,"endYear":2556,"status":"complete"}]
As far as I know, this can be solved using dictionary/JSON concepts.
I'm trying to split this column into new columns, so that each key of the nested cell becomes a new column holding its respective value.
I tried json_normalize, but I'm getting this error: "AttributeError: 'str' object has no attribute 'values'"
I tried to convert those cells to dicts, but I was never able to make Python understand that "institution" is a key and "some_institution" is a value in the resulting dict. It seems Python treats the whole cell as a string.
Can you help me? If I wasn't clear, please tell me. Thanks!
IIUC, the following should work:
Input
import pandas as pd
df = pd.DataFrame({'col1':[1], 'col2':[2], 'nested_column':['[{"institution":"some_institution","startMonth":1,"startYear":2563,"course":"any","id":1111,"formation":"any","endMonth":12,"endYear":2556,"status":"complete"}]']})
df
col1 col2 nested_column
0 1 2 [{"institution":"some_institution","startMonth...
Process
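import json
# Parse each JSON string and keep the first record; missing cells become an empty dict
df['nested_column_dict'] = df['nested_column'].apply(lambda x: json.loads(x)[0] if isinstance(x, str) else {})
# Expand the dicts into columns and append them to the original frame
df = pd.concat([df, pd.DataFrame(df['nested_column_dict'].tolist(), index=df.index)], axis=1)
df = df.drop(columns='nested_column_dict')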
Output
df
col1 col2 nested_column institution startMonth startYear course id formation endMonth endYear status
0 1 2 [{"institution":"some_institution","startMonth... some_institution 1 2563 any 1111 any 12 2556 complete
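If some cells hold more than one record in the list, or are missing entirely, taking [0] keeps only the first record. Here is a minimal sketch of one way to keep every record as its own row. The sample data, the `_row` helper column and the variable names are made up for illustration and are not from the question.

```python
import json
import pandas as pd

# Hypothetical sample: the first cell has two records, the second cell is missing
df = pd.DataFrame({
    "col1": [1, 2],
    "nested_column": [
        '[{"institution": "a", "startYear": 2001}, {"institution": "b", "startYear": 2005}]',
        None,
    ],
})

rows = []
for idx, cell in df["nested_column"].items():
    # Only strings are parsed; anything else (None, NaN) counts as "no records"
    records = json.loads(cell) if isinstance(cell, str) else []
    # Keep one empty placeholder so rows without records are not dropped
    for rec in records or [{}]:
        rows.append({"_row": idx, **rec})

flat = pd.DataFrame(rows).set_index("_row")

# Joining on the index repeats the original row once per record
result = df.drop(columns="nested_column").join(flat).reset_index(drop=True)
print(result)
```

Each original row is repeated once per record, and a row whose cell was missing keeps NaN in the new columns.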
Maybe this helps.
import pandas as pd
data = [{"institution":"some_institution", "startMonth":1, "startYear":2563, "course":"any", "id":1111, "formation":"any", "endMonth":12, "endYear":2556, "status":"complete"}]
record = data[0]  # the list holds a single dict; take it
df = pd.DataFrame([record])  # one row, one column per key
df
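Note that this snippet starts from a real Python list. In the question the cell is a string, which is presumably why json_normalize raised the AttributeError. A small sketch of parsing the text first, assuming the cell holds valid JSON (`cell`, `records` and `df_cell` are illustrative names):

```python
import json
import pandas as pd

# The cell as it appears in the question: JSON text, not a list of dicts
cell = '[{"institution":"some_institution","startMonth":1,"startYear":2563,"course":"any","id":1111,"formation":"any","endMonth":12,"endYear":2556,"status":"complete"}]'

records = json.loads(cell)  # now a list containing one dict
print(type(cell).__name__, type(records).__name__)  # str list

# A list of dicts maps directly to rows and columns
df_cell = pd.DataFrame(records)
print(df_cell)
```

Once the text is parsed, pd.json_normalize(records) should also accept it, since it then receives dicts rather than a string.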