I have a dataframe like the one below.
df (Input Data)
ID  Status   linkedShipments
12  Active   [{'SID': 'GBDXY551', 'Code': 'GBDXY55', 'Num': '2021121'}, {'SID': 'GBDXY551', 'Code': 'GBDXY55', 'Num': '20211215'}]
32  Expired  [{'SID': 'CHSGI422', 'Code': 'CHSGI421', 'Num': '4024421'}, {'SID': 'GBDXY551', 'Code': 'GBDXY55', 'Num': '20211222'}]
36  Expired  [{'SID': 'CHSGI428', 'Code': 'CHSGI907', 'Num': '4024568'}, {'SID': 'GBDXY556', 'Code': 'GBDXY55', 'Num': '20211333'}]
Expected Output
ID  SID       Code      Num       Status
12  GBDXY551  GBDXY55   2021121   Active
12  GBDXY551  GBDXY55   20211215  Active
32  CHSGI422  CHSGI421  4024421   Expired
32  GBDXY551  GBDXY55   20211222  Expired
36  CHSGI428  CHSGI907  4024568   Expired
36  GBDXY556  GBDXY55   20211333  Expired
**My Current Code**
This works with only one key, and I also want to add the Status column to the output dataframe. How can that be done?
import json
import pandas as pd

# load as dataframe
df = pd.DataFrame(data)
new_data = {}  # define new data
# traverse all rows in current data
for index, row in df.iterrows():
    # json only accepts double quotes, so convert single quotes to double quotes
    shipment_dict_list = json.loads(row['linkedShipments'].replace("'", '"'))
    for shipment_dict in shipment_dict_list:
        new_data.setdefault("ID", []).append(row['ID'])
        for key in shipment_dict:
            new_data.setdefault(key, []).append(shipment_dict[key])
print(pd.DataFrame(new_data))
This can be achieved with a combination of explode and apply(pd.Series):
df2 = df.explode('linkedShipments').reset_index(drop = True)
df2.join(df2['linkedShipments'].apply(pd.Series)).drop(columns = 'linkedShipments')
Output:
   ID   Status       SID      Code       Num
0  12   Active  GBDXY551   GBDXY55   2021121
1  12   Active  GBDXY551   GBDXY55  20211215
2  32  Expired  CHSGI422  CHSGI421   4024421
3  32  Expired  GBDXY551   GBDXY55  20211222
4  36  Expired  CHSGI428  CHSGI907   4024568
5  36  Expired  GBDXY556   GBDXY55  20211333
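One caveat: explode only splits cells that hold real lists. If the linkedShipments cells are stored as strings (the json.loads call in the question suggests they might be), they would need to be parsed first. The following is a rough sketch of the same flattening in plain Python, under that assumption. The names flatten_shipments and records are hypothetical, and the sample data is a shortened copy of the input above with one cell given as a string and one as a list.

```python
import ast


def flatten_shipments(records, list_key="linkedShipments"):
    """Return one flat dict per shipment, copying the parent fields onto each."""
    rows = []
    for record in records:
        shipments = record[list_key]
        # Assumption: a cell may be a string such as "[{'SID': ...}]".
        # ast.literal_eval parses Python-style literals, single quotes included.
        if isinstance(shipments, str):
            shipments = ast.literal_eval(shipments)
        # Parent fields (ID, Status, ...) are everything except the list column.
        parent = {k: v for k, v in record.items() if k != list_key}
        for shipment in shipments:
            rows.append({**parent, **shipment})
    return rows


# Hypothetical sample: the first cell is a string, the second a real list.
records = [
    {"ID": 12, "Status": "Active",
     "linkedShipments": "[{'SID': 'GBDXY551', 'Code': 'GBDXY55', 'Num': '2021121'}, "
                        "{'SID': 'GBDXY551', 'Code': 'GBDXY55', 'Num': '20211215'}]"},
    {"ID": 32, "Status": "Expired",
     "linkedShipments": [{"SID": "CHSGI422", "Code": "CHSGI421", "Num": "4024421"},
                         {"SID": "GBDXY551", "Code": "GBDXY55", "Num": "20211222"}]},
]

rows = flatten_shipments(records)
for row in rows:
    print(row)
```

With pandas, df.to_dict('records') should supply input in this shape, and pd.DataFrame(rows) should turn the result back into a dataframe with the ID and Status columns carried along.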