How to create a nested array of arrays inside a pandas DataFrame column
I have a dataframe (df), shown below.
Input
ShipID CustomerCode
['USWPR04-20210429-S-00001', 'USWPR04-20210429-S-00002','USWPR04-20210429-S-00006'] USWPR04
['MSLPR04-20210429-S-00001', 'MSLPR04-20210429-S-00002'] MSLPR04
I need to create a new column, df['LinkID'], which is a nested array built from the columns above.
Output
df['LinkID']
[{ "shipID": "USWPR04-20210429-S-00001", "customerCode": "USWPR04", "shipNumber": "20210429-S-00001" },
{ "shipID": "USWPR04-20210429-S-00002", "customerCode": "USWPR04", "shipNumber": "20210429-S-00002" },
{ "shipID": "USWPR04-20210429-S-00006", "customerCode": "USWPR04", "shipNumber": "20210429-S-00006" }]
[{ "shipID": "MSLPR04-20210429-S-00001", "customerCode": "MSLPR04", "shipNumber": "20210429-S-00001" },
{ "shipID": "MSLPR04-20210429-S-00002", "customerCode": "MSLPR04", "shipNumber": "20210429-S-00002" }]
Final DataFrame output
ShipID CustomerCode link
['USWPR04-20210429-S-00001', 'USWPR04-20210429-S-00002','USWPR04-20210429-S-00006'] USWPR04 [{ "shipID": "USWPR04-20210429-S-00001", "customerCode": "USWPR04", "shipNumber": "20210429-S-00001" },{ "shipID": "USWPR04-20210429-S-00002", "customerCode": "USWPR04", "shipNumber": "20210429-S-00002" },{ "shipID": "USWPR04-20210429-S-00006", "customerCode": "USWPR04", "shipNumber": "20210429-S-00006" }]
['MSLPR04-20210429-S-00001', 'MSLPR04-20210429-S-00002'] MSLPR04 [{ "shipID": "MSLPR04-20210429-S-00001", "customerCode": "MSLPR04", "shipNumber": "20210429-S-00001" },{ "shipID": "MSLPR04-20210429-S-00002", "customerCode": "MSLPR04", "shipNumber": "20210429-S-00002" }]
How can I achieve this?
Updated answer:
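Steps:
1. If ShipID is stored as a string rather than as a list, convert it to a real list with eval.
2. Explode the dataframe on ShipID so that each ship ID gets its own row.
3. Use the .str.split method to extract shipNumber.
4. Convert the rows to dictionaries with to_dict('records') and load them back into the dataframe.
5. Use groupby and agg with list to convert it back to the original structure.
# Uncomment if ShipID holds strings instead of lists:
# df.ShipID = df.ShipID.apply(eval)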
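df2 = df.explode('ShipID')  # one row per ship ID; the index repeats the original row label
df2['shipNumber'] = df2.ShipID.str.split('-', n=1).str[-1]  # text after the first '-'
# Assign the records as a plain list so they are matched by position;
# wrapping them in a new DataFrame would align on df2's duplicated index instead.
df2['link'] = df2.to_dict('records')
df['link'] = df2.groupby(level=0)['link'].agg(list)  # regroup by original row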
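
For reference, here is a self-contained sketch of the five steps above, run on the sample data from the question. It makes a few assumptions that are not in the original answer: ShipID is taken to arrive as a string (for example after reading a CSV), ast.literal_eval is swapped in for eval because it only accepts Python literals, and the columns are renamed before to_dict('records') so that the dictionary keys match the "shipID" / "customerCode" spelling in the desired output.

```python
import ast

import pandas as pd

# Sample data mirroring the question; ShipID is assumed to be stored as a string.
df = pd.DataFrame({
    "ShipID": [
        "['USWPR04-20210429-S-00001', 'USWPR04-20210429-S-00002', 'USWPR04-20210429-S-00006']",
        "['MSLPR04-20210429-S-00001', 'MSLPR04-20210429-S-00002']",
    ],
    "CustomerCode": ["USWPR04", "MSLPR04"],
})

# Step 1: parse each string into a real list (literal_eval rejects arbitrary code).
df["ShipID"] = df["ShipID"].apply(ast.literal_eval)

# Step 2: one row per ship ID; the original row label is repeated in the index.
df2 = df.explode("ShipID")

# Step 3: everything after the first '-' is the ship number.
df2["shipNumber"] = df2["ShipID"].str.split("-", n=1).str[-1]

# Step 4: rename columns to the key spelling of the desired output,
# then build one dict per exploded row.
records = df2.rename(
    columns={"ShipID": "shipID", "CustomerCode": "customerCode"}
).to_dict("records")
df2["link"] = records  # a plain list is assigned by position

# Step 5: collect the dicts back into one list per original row.
df["link"] = df2.groupby(level=0)["link"].agg(list)

print(df.loc[0, "link"])
```

Grouping on level=0 relies on explode keeping the original row labels, which is its default behaviour; if the dataframe's index is not unique to begin with, calling reset_index(drop=True) first would be a reasonable precaution.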