
How to separate nested comma separated column values in pandas data frame using python?

I have a DataFrame like the one below.

df (Input Data)

ID Status     linkedShipments
12  Active   [{'SID': 'GBDXY551', 'Code': 'GBDXY55', 'Num': '2021121'}, {'SID': 'GBDXY551', 'Code': 'GBDXY55', 'Num': '20211215'}]
32  Expired  [{'SID': 'CHSGI422', 'Code': 'CHSGI421', 'Num': '4024421'}, {'SID': 'GBDXY551', 'Code': 'GBDXY55', 'Num': '20211222'}]
36  Expired  [{'SID': 'CHSGI428', 'Code': 'CHSGI907', 'Num': '4024568'}, {'SID': 'GBDXY556', 'Code': 'GBDXY55', 'Num': '20211333'}]

Expected Output

ID  SID         Code     Num      Status
12  GBDXY551    GBDXY55  2021121  Active
12  GBDXY551    GBDXY55  20211215 Active
32  CHSGI422    CHSGI421 4024421  Expired
32  GBDXY551    GBDXY55  20211222 Expired
36  CHSGI428    CHSGI907 4024568  Expired
36  GBDXY556    GBDXY55  20211333 Expired

**My Current Code**

This works with only one key. I also want to add the Status column to the output DataFrame. How can that be done?

import json
import pandas as pd

# load as DataFrame (data is the raw input, defined elsewhere)
df = pd.DataFrame(data)

new_data = {}  # columns of the new DataFrame, built up as lists
# traverse all rows in the current data
for index, row in df.iterrows():
    # JSON only accepts double quotes, so convert single quotes to double quotes
    shipment_dict_list = json.loads(row['linkedShipments'].replace("'", '"'))
    for shipment_dict in shipment_dict_list:
        new_data.setdefault("ID", []).append(row['ID'])
        for key in shipment_dict:
            new_data.setdefault(key, []).append(shipment_dict[key])
print(pd.DataFrame(new_data))

This can be achieved with a combination of explode and apply(pd.Series):

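# explode needs linkedShipments to hold actual lists of dicts, not strings
df2 = df.explode('linkedShipments').reset_index(drop=True)
# expand each dict into its own columns, then drop the original column
df2.join(df2['linkedShipments'].apply(pd.Series)).drop(columns='linkedShipments')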

Output:

    ID  Status  SID         Code        Num
0   12  Active  GBDXY551    GBDXY55     2021121
1   12  Active  GBDXY551    GBDXY55     20211215
2   32  Expired CHSGI422    CHSGI421    4024421
3   32  Expired GBDXY551    GBDXY55     20211222
4   36  Expired CHSGI428    CHSGI907    4024568
5   36  Expired GBDXY556    GBDXY55     20211333
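
The question's code calls json.loads on linkedShipments, which suggests the column may hold strings rather than real lists. If so, explode would leave each row unchanged. Below is a minimal end-to-end sketch under that assumption: it parses the strings with ast.literal_eval first, then explodes and expands them. The sample data is copied from the question. The helper name to_list and the final column order are illustrative choices, not part of the original answer.

```python
import ast

import pandas as pd

# Sample rows from the question. linkedShipments is stored as strings here,
# which is an assumption based on the question's use of json.loads.
data = {
    "ID": [12, 32, 36],
    "Status": ["Active", "Expired", "Expired"],
    "linkedShipments": [
        "[{'SID': 'GBDXY551', 'Code': 'GBDXY55', 'Num': '2021121'}, "
        "{'SID': 'GBDXY551', 'Code': 'GBDXY55', 'Num': '20211215'}]",
        "[{'SID': 'CHSGI422', 'Code': 'CHSGI421', 'Num': '4024421'}, "
        "{'SID': 'GBDXY551', 'Code': 'GBDXY55', 'Num': '20211222'}]",
        "[{'SID': 'CHSGI428', 'Code': 'CHSGI907', 'Num': '4024568'}, "
        "{'SID': 'GBDXY556', 'Code': 'GBDXY55', 'Num': '20211333'}]",
    ],
}
df = pd.DataFrame(data)


def to_list(value):
    """Hypothetical helper: parse a string into a list, pass real lists through."""
    return ast.literal_eval(value) if isinstance(value, str) else value


df["linkedShipments"] = df["linkedShipments"].apply(to_list)

# One row per shipment dict; ID and Status are repeated for each shipment.
df2 = df.explode("linkedShipments").reset_index(drop=True)

# Expand each dict into SID / Code / Num columns, aligned on df2's index.
shipments = pd.DataFrame(df2["linkedShipments"].tolist(), index=df2.index)
result = df2.drop(columns="linkedShipments").join(shipments)

# Reorder the columns to match the expected output in the question.
result = result[["ID", "SID", "Code", "Num", "Status"]]
print(result)
```

ast.literal_eval is used instead of swapping quote characters before json.loads because it reads the single-quoted Python literal directly and does not break if a value itself contains a quote. Building the expanded columns with pd.DataFrame(list_of_dicts) should give the same result as apply(pd.Series) and is usually faster on larger frames.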


