How do I split concatenated column names into separate columns?

Question

For an analysis, I have been given a dataset whose column names each contain specific information about the product, market and distribution.

The structure of the dataset is as follows:

Date     Product1|CBA|MKD  Product1|CPA|MKD     Product1|CBA|IHR    Product2|CBA|IHR
2018-11  12                 23                   0                   2

There are a lot of unique column combinations. What I would like to get is the following structure:

Date      Product    Partner   Market      Quantity
2020-1    Product1   CBA       MKD         11
2020-1    Product1   CPA       MKD         22
2020-1    Product1   CBA       IHR         0
2020-1    Product2   CBA       IHR         1

So, I want to create three new columns and populate them with the values embedded in the column name. The Quantity column would contain the values of the old concatenated column (that part I know how to do); the problem is building the first three columns.

I have tried to do this in pandas by matching strings, but I am really stuck. I'd appreciate some help, thank you!

Answer 1

It looks like you could use pandas.melt:

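# Unpivot every column except Date into (variable, Quantity) rows
df_ = df.melt(id_vars='Date', value_name='Quantity')
# Split the old column names on '|' into three new columns
df_[['Product', 'Partner', 'Market']] = (
    df_['variable'].str.split('|', expand=True).dropna(axis=1)
)
# Remove the original concatenated names
df_.pop('variable')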

df_
Out[67]: 
      Date  Quantity   Product Partner Market
0  2018-11        12  Product1     CBA    MKD
1  2018-11        23  Product1     CPA    MKD
2  2018-11         0  Product1     CBA    IHR
3  2018-11         2  Product2     CBA    IHR
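
To try this end to end, here is a minimal self-contained sketch. The sample frame is rebuilt from the table in the question; the helper name `split_column_names` and its parameters are made up for illustration, and the sketch assumes every column name has exactly three `|`-separated parts.

```python
import pandas as pd

# Sample frame rebuilt from the question's table
df = pd.DataFrame({
    "Date": ["2018-11"],
    "Product1|CBA|MKD": [12],
    "Product1|CPA|MKD": [23],
    "Product1|CBA|IHR": [0],
    "Product2|CBA|IHR": [2],
})


def split_column_names(frame, id_col="Date", sep="|"):
    """Hypothetical helper: melt to long form, then split the old names."""
    long = frame.melt(id_vars=id_col, var_name="variable", value_name="Quantity")
    # Assumption: each name splits into exactly three parts
    parts = long["variable"].str.split(sep, expand=True)
    parts.columns = ["Product", "Partner", "Market"]
    # Put the columns in the order the question asks for
    return pd.concat([long[[id_col]], parts, long[["Quantity"]]], axis=1)


result = split_column_names(df)
print(result)
```

If some names had more or fewer than three parts, assigning the three column labels would raise an error, which is probably preferable to silently misaligned data.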

Answer 2

Here's another way to do it:

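import pandas as pd
# Stack the column names into the index, then pull them back out as 'level_1'
st = df.set_index("Date").stack().reset_index(-1)
# Split each old column name into its three parts
res = st["level_1"].str.split("|")
st[["Product", "Partner", "Market"]] = pd.DataFrame(res.tolist(), index=st.index)
# Drop the old names and label the value column
df2 = st.drop("level_1", axis=1).rename({0: "Quantity"}, axis=1)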

print(df2)

         Quantity   Product Partner Market
Date
2018-11        12  Product1     CBA    MKD
2018-11        23  Product1     CPA    MKD
2018-11         0  Product1     CBA    IHR
2018-11         2  Product2     CBA    IHR
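
A related variant, offered only as a sketch: split the names first, turn the columns into a `MultiIndex`, and stack all three levels in one go. This is not the code from the answer above, and it assumes every column name has exactly three `|`-separated parts. The sample frame is rebuilt from the question.

```python
import pandas as pd

# Sample frame rebuilt from the question's table
df = pd.DataFrame({
    "Date": ["2018-11"],
    "Product1|CBA|MKD": [12],
    "Product1|CPA|MKD": [23],
    "Product1|CBA|IHR": [0],
    "Product2|CBA|IHR": [2],
})

wide = df.set_index("Date")
# Replace the flat names with a three-level column index
wide.columns = pd.MultiIndex.from_tuples(
    [tuple(name.split("|")) for name in wide.columns],
    names=["Product", "Partner", "Market"],
)

# Move all three column levels into the row index, then flatten
long = (
    wide.stack(["Product", "Partner", "Market"])
        .rename("Quantity")
        .reset_index()
)
print(long)
```

Depending on the pandas version, `stack` may reorder the rows or upcast the integers to floats, so sort or cast afterwards if that matters.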

Answer 3

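# Unpivot, drop rows with missing values, and order by date
a = (df.melt(id_vars=["Date"], var_name="Product", value_name="Val")
       .dropna(how='any')
       .sort_values('Date'))
# Each piece of the old column name becomes its own column;
# 'Product' is overwritten last because the other two are read from it
a['Partner'] = a['Product'].str.split("|").str[1]
a['Market'] = a['Product'].str.split("|").str[-1]
a['Product'] = a['Product'].str.split("|").str[0]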
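
Since the question mentions trying to match strings, the same split can also be sketched with one regular expression and `Series.str.extract`, which avoids splitting the name three times. The pattern below is an assumption about the naming scheme (exactly three fields, none containing `|`), and the sample frame is rebuilt from the question.

```python
import pandas as pd

# Sample frame rebuilt from the question's table
df = pd.DataFrame({
    "Date": ["2018-11"],
    "Product1|CBA|MKD": [12],
    "Product1|CPA|MKD": [23],
    "Product1|CBA|IHR": [0],
    "Product2|CBA|IHR": [2],
})

long = df.melt(id_vars="Date", var_name="variable", value_name="Quantity")

# One named group per field; [^|]+ matches a run of characters without '|'
pattern = r"^(?P<Product>[^|]+)\|(?P<Partner>[^|]+)\|(?P<Market>[^|]+)$"
parts = long["variable"].str.extract(pattern)

result = pd.concat([long[["Date"]], parts, long[["Quantity"]]], axis=1)

# Names that do not match the pattern yield NaN in all three new columns
unmatched = parts.isna().all(axis=1)
print(result)
```

Checking `unmatched.any()` is a cheap way to spot column names that do not follow the expected scheme.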


Answer 1
3 accepted 2019-08-09 13:58:18

Answer 2
2 2019-08-09 14:33:44

Answer 3
1 2019-08-09 14:07:37
