How to separate nested comma-separated column values in a pandas DataFrame using Python?

I have a DataFrame like the one below:

df (Input Data)

ID Status     linkedShipments
12  Active   [{'SID': 'GBDXY551', 'Code': 'GBDXY55', 'Num': '2021121'}, {'SID': 'GBDXY551', 'Code': 'GBDXY55', 'Num': '20211215'}]
32  Expired  [{'SID': 'CHSGI422', 'Code': 'CHSGI421', 'Num': '4024421'}, {'SID': 'GBDXY551', 'Code': 'GBDXY55', 'Num': '20211222'}]
36  Expired  [{'SID': 'CHSGI428', 'Code': 'CHSGI907', 'Num': '4024568'}, {'SID': 'GBDXY556', 'Code': 'GBDXY55', 'Num': '20211333'}]

Expected Output

ID  SID         Code     Num      Status
12  GBDXY551    GBDXY55  2021121  Active
12  GBDXY551    GBDXY55  20211215 Active
32  CHSGI422    CHSGI421 4024421  Expired
32  GBDXY551    GBDXY55  20211222 Expired
36  CHSGI428    CHSGI907 4024568  Expired
36  GBDXY556    GBDXY55  20211333 Expired

**My Current Code**

This works with only one key. I also want to add the Status column to the output DataFrame. How can that be done?

import json
import pandas as pd

# load as DataFrame
df = pd.DataFrame(data)

new_data = {}  # define new data
# traverse all rows in the current data
for index, row in df.iterrows():
    # JSON only accepts double quotes, so convert single quotes to double quotes
    shipment_dict_list = json.loads(row['linkedShipments'].replace("'", '"'))
    for shipment_dict in shipment_dict_list:
        new_data.setdefault("ID", []).append(row['ID'])
        for key in shipment_dict:
            new_data.setdefault(key, []).append(shipment_dict[key])
print(pd.DataFrame(new_data))

This can be achieved with a combination of explode and apply(pd.Series), assuming linkedShipments holds actual lists of dicts rather than their string representation:

df2 = df.explode('linkedShipments').reset_index(drop=True)
df2.join(df2['linkedShipments'].apply(pd.Series)).drop(columns='linkedShipments')

Output:

    ID  Status  SID         Code        Num
0   12  Active  GBDXY551    GBDXY55     2021121
1   12  Active  GBDXY551    GBDXY55     20211215
2   32  Expired CHSGI422    CHSGI421    4024421
3   32  Expired GBDXY551    GBDXY55     20211222
4   36  Expired CHSGI428    CHSGI907    4024568
5   36  Expired GBDXY556    GBDXY55     20211333
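
For completeness, here is a self-contained sketch of the same explode and apply(pd.Series) approach. It assumes linkedShipments may arrive as a string (which the question's use of json.loads suggests), so it parses such values with ast.literal_eval first. The sample data is copied from the question; the helper name parse_shipments and the variable names are made up for illustration.

```python
import ast

import pandas as pd

# Sample data copied from the question. Storing linkedShipments as strings
# is an assumption based on the question's json.loads call.
data = {
    "ID": [12, 32, 36],
    "Status": ["Active", "Expired", "Expired"],
    "linkedShipments": [
        "[{'SID': 'GBDXY551', 'Code': 'GBDXY55', 'Num': '2021121'}, "
        "{'SID': 'GBDXY551', 'Code': 'GBDXY55', 'Num': '20211215'}]",
        "[{'SID': 'CHSGI422', 'Code': 'CHSGI421', 'Num': '4024421'}, "
        "{'SID': 'GBDXY551', 'Code': 'GBDXY55', 'Num': '20211222'}]",
        "[{'SID': 'CHSGI428', 'Code': 'CHSGI907', 'Num': '4024568'}, "
        "{'SID': 'GBDXY556', 'Code': 'GBDXY55', 'Num': '20211333'}]",
    ],
}
df = pd.DataFrame(data)


def parse_shipments(value):
    """Hypothetical helper: parse a string into a list of dicts, pass lists through."""
    return ast.literal_eval(value) if isinstance(value, str) else value


# explode needs real lists, so parse any string values first.
df["linkedShipments"] = df["linkedShipments"].apply(parse_shipments)

# One row per shipment dict; ID and Status are repeated for each one.
exploded = df.explode("linkedShipments").reset_index(drop=True)

# Expand each dict into its own columns (SID, Code, Num).
shipments = exploded["linkedShipments"].apply(pd.Series)

# Combine and reorder to match the expected output in the question.
result = exploded.drop(columns="linkedShipments").join(shipments)
result = result[["ID", "SID", "Code", "Num", "Status"]]
print(result)
```

ast.literal_eval is used instead of swapping quotes and calling json.loads because it reads Python-style single-quoted literals directly and does not break if a value itself contains a quote character.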
