How to separate nested comma-separated column values in a pandas DataFrame using Python?
I have a dataframe as shown below.
df (input data)
ID Status linkedShipments
12 Active [{'SID': 'GBDXY551', 'Code': 'GBDXY55', 'Num': '2021121'}, {'SID': 'GBDXY551', 'Code': 'GBDXY55', 'Num': '20211215'}]
32 Expired [{'SID': 'CHSGI422', 'Code': 'CHSGI421', 'Num': '4024421'}, {'SID': 'GBDXY551', 'Code': 'GBDXY55', 'Num': '20211222'}]
36 Expired [{'SID': 'CHSGI428', 'Code': 'CHSGI907', 'Num': '4024568'}, {'SID': 'GBDXY556', 'Code': 'GBDXY55', 'Num': '20211333'}]
Expected output
ID  SID       Code      Num       Status
12  GBDXY551  GBDXY55   2021121   Active
12  GBDXY551  GBDXY55   20211215  Active
32  CHSGI422  CHSGI421  4024421   Expired
32  GBDXY551  GBDXY55   20211222  Expired
36  CHSGI428  CHSGI907  4024568   Expired
36  GBDXY556  GBDXY55   20211333  Expired
**My Current Code**
This only handles one key column (ID). I also want to add the Status column to the output dataframe. How can I do that?
import json
import pandas as pd

# load as dataframe (data is the raw input, defined elsewhere)
df = pd.DataFrame(data)
new_data = {}  # columns of the flattened result
# traverse all rows in the current data
for index, row in df.iterrows():
    # json only accepts double quotes, so convert single quotes to double quotes
    shipment_dict_list = json.loads(row['linkedShipments'].replace("'", '"'))
    for shipment_dict in shipment_dict_list:
        # repeat the ID once per nested shipment
        new_data.setdefault("ID", []).append(row['ID'])
        for key in shipment_dict:
            new_data.setdefault(key, []).append(shipment_dict[key])
print(pd.DataFrame(new_data))
This can be done by combining explode and apply(pd.Series):
df2 = df.explode('linkedShipments').reset_index(drop = True)
df2.join(df2['linkedShipments'].apply(pd.Series)).drop(columns = 'linkedShipments')
output:
   ID   Status       SID      Code       Num
0  12   Active  GBDXY551   GBDXY55   2021121
1  12   Active  GBDXY551   GBDXY55  20211215
2  32  Expired  CHSGI422  CHSGI421   4024421
3  32  Expired  GBDXY551   GBDXY55  20211222
4  36  Expired  CHSGI428  CHSGI907   4024568
5  36  Expired  GBDXY556   GBDXY55  20211333
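Note that explode only splits actual Python lists. The question's code calls json.loads on linkedShipments, which suggests the column may hold strings rather than lists. Below is a minimal sketch under that assumption: the sample frame is rebuilt by hand from the first two rows of the question, and the strings are parsed with ast.literal_eval before exploding. pd.json_normalize is used here as an alternative to apply(pd.Series); it is not part of the original answer.

```python
import ast

import pandas as pd

# Sample data rebuilt from the question; linkedShipments is assumed to be a string.
df = pd.DataFrame({
    "ID": [12, 32],
    "Status": ["Active", "Expired"],
    "linkedShipments": [
        "[{'SID': 'GBDXY551', 'Code': 'GBDXY55', 'Num': '2021121'}, "
        "{'SID': 'GBDXY551', 'Code': 'GBDXY55', 'Num': '20211215'}]",
        "[{'SID': 'CHSGI422', 'Code': 'CHSGI421', 'Num': '4024421'}, "
        "{'SID': 'GBDXY551', 'Code': 'GBDXY55', 'Num': '20211222'}]",
    ],
})

# Turn each string into a real list of dicts (handles the single quotes safely).
df["linkedShipments"] = df["linkedShipments"].apply(ast.literal_eval)

# One row per shipment dict; ID and Status are repeated automatically.
df2 = df.explode("linkedShipments").reset_index(drop=True)

# Expand the dicts into SID / Code / Num columns and attach them by position.
shipments = pd.json_normalize(df2["linkedShipments"].tolist())
result = df2.drop(columns="linkedShipments").join(shipments)

# Reorder the columns to match the expected output in the question.
result = result[["ID", "SID", "Code", "Num", "Status"]]
print(result)
```

If the column already contains lists, the ast.literal_eval step can simply be dropped.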