Combining sets of columns using pandas

I have the following data frame structure:

   SC0    Shape   S1   S2   S3   C1   C2   C3   D1   D2   D3
2    1   Circle  NaN  NaN  NaN    1    1    1  NaN  NaN  NaN
3   13   Square    2    1    2  NaN  NaN  NaN  NaN  NaN  NaN
4   13  Diamond  NaN  NaN  NaN  NaN  NaN  NaN    2    1    2
5   16  Diamond  NaN  NaN  NaN  NaN  NaN  NaN    2    2    2
6   16   Square    2    2    2  NaN  NaN  NaN  NaN  NaN  NaN
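
For anyone who wants to reproduce the frame above, here is a minimal sketch. It assumes the scores are stored as floats (so the gaps are `NaN`) and that the index is the 2-6 shown in the printout; only three score columns per shape are included, as in the sample.

```python
import numpy as np
import pandas as pd

nan = np.nan

# Sample frame from the question: one row per (SC0, Shape) pair, with only
# that shape's score columns populated.
df = pd.DataFrame(
    {
        "SC0": [1, 13, 13, 16, 16],
        "Shape": ["Circle", "Square", "Diamond", "Diamond", "Square"],
        "S1": [nan, 2, nan, nan, 2],
        "S2": [nan, 1, nan, nan, 2],
        "S3": [nan, 2, nan, nan, 2],
        "C1": [1, nan, nan, nan, nan],
        "C2": [1, nan, nan, nan, nan],
        "C3": [1, nan, nan, nan, nan],
        "D1": [nan, nan, 2, 2, nan],
        "D2": [nan, nan, 1, 2, nan],
        "D3": [nan, nan, 2, 2, nan],
    },
    index=[2, 3, 4, 5, 6],
)
print(df)
```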

How can I combine S1, S2, S3 with C1, C2, C3 and D1, D2, D3 so that S1, C1 and D1 are in the same column, S2, C2 and D2 are in the same column, and so on (all the way to S16, C16 and D16)?

When Shape = Circle the populated columns are C1-C16; when Shape = Square it's S1-S16; and for Shape = Diamond it's D1-D16.

I don't mind creating a new set of columns or copying two of the sets into an existing one, as long as I have all the #1 scores in the same column, all the #2 scores in the same column, etc.

Thank you!

IIUC, you have an equal number of columns for each category, and you want to compress them into numeric columns that are shape-agnostic. If so, this will work:

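import pandas as pd

dfs = []
for var in ['S', 'D', 'C']:
    # select this shape's score columns (e.g. S1..S16) with a regex
    res = df.iloc[:, 2:].filter(regex=r'^' + var + r'\d{1,2}$').dropna()
    # rename the columns to plain numbers so the three blocks line up
    res.columns = range(res.shape[1])
    dfs.append(res)

# stack the three blocks, then re-attach the two identifier columns
# (axis must be passed by keyword in recent pandas versions)
df = pd.concat([df.iloc[:, :2], pd.concat(dfs)], axis=1)
print(df)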

Output:

   SC0    Shape    0    1    2
2    1   Circle  1.0  1.0  1.0
3   13   Square  2.0  1.0  2.0
4   13  Diamond  2.0  1.0  2.0
5   16  Diamond  2.0  2.0  2.0
6   16   Square  2.0  2.0  2.0
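
As a side note on the column selection, here is a small sketch of how `DataFrame.filter(regex=...)` picks out one shape's columns once the names run to two digits. The frame `wide` and its single row of zeros are hypothetical, made up only to have the 16-per-shape column names from the question.

```python
import pandas as pd

# Hypothetical one-row frame with SC0, Shape, and S1..S16, C1..C16, D1..D16.
cols = ["SC0", "Shape"] + [f"{p}{i}" for p in "SCD" for i in range(1, 17)]
wide = pd.DataFrame([[0] * len(cols)], columns=cols)

# The pattern is matched with a regex search against each column name.
# Anchoring it with ^ and $ keeps 'SC0' and 'Shape' out without needing
# to slice the identifier columns off first.
s_cols = wide.filter(regex=r"^S\d{1,2}$").columns.tolist()
print(s_cols)
```

Because `filter` preserves the original column order, renaming the selected columns to `range(n)` lines S1 up with C1 and D1, provided each shape's columns appear in ascending order in the frame.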

Try:

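n = 3  # number of score columns per shape (16 in the full data)
cols_prefixes = ["C", "S", "D"]
for i in range(n):
    # the three columns holding score number i+1, e.g. C1, S1, D1
    cols = [f"{prefix}{i+1}" for prefix in cols_prefixes]
    # back-fill across the row so the first column holds the one non-NaN value
    df[f"res{i+1}"] = df[cols].bfill(axis=1).iloc[:, 0]
    df = df.drop(columns=cols)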

Output:

   SC0    Shape  res1  res2  res3
2    1   Circle   1.0   1.0   1.0
3   13   Square   2.0   1.0   2.0
4   13  Diamond   2.0   1.0   2.0
5   16  Diamond   2.0   2.0   2.0
6   16   Square   2.0   2.0   2.0
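
If you would rather not hard-code `n`, one possible variation (not from either answer above) is to group the score columns by their numeric suffix and take the first non-null value in each group. This is only a sketch: it assumes every score column is a single letter followed by a number, and the `res` prefix and the names `scores`, `suffix` and `combined` are just illustrative.

```python
import numpy as np
import pandas as pd

nan = np.nan
# Sample frame from the question, rebuilt here so the sketch runs on its own.
df = pd.DataFrame(
    [
        [1, "Circle", nan, nan, nan, 1, 1, 1, nan, nan, nan],
        [13, "Square", 2, 1, 2, nan, nan, nan, nan, nan, nan],
        [13, "Diamond", nan, nan, nan, nan, nan, nan, 2, 1, 2],
        [16, "Diamond", nan, nan, nan, nan, nan, nan, 2, 2, 2],
        [16, "Square", 2, 2, 2, nan, nan, nan, nan, nan, nan],
    ],
    columns=["SC0", "Shape", "S1", "S2", "S3", "C1", "C2", "C3", "D1", "D2", "D3"],
    index=[2, 3, 4, 5, 6],
)

scores = df.drop(columns=["SC0", "Shape"]).astype(float)

# Assumption: one-letter prefix, so the score number is everything after it.
suffix = pd.Index([int(name[1:]) for name in scores.columns])

# Transpose so the score columns become rows, group those rows by score
# number, and keep the first non-null value per group (GroupBy.first skips NaN).
combined = scores.T.groupby(suffix).first().T.add_prefix("res")

result = pd.concat([df[["SC0", "Shape"]], combined], axis=1)
print(result)
```

The transpose is there because grouping along columns directly (`groupby(..., axis=1)`) is deprecated in recent pandas versions.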
