
Python. Merge repeated columns

I have to create a dataframe from a file in which some columns are repeated and their values are split across the repeats, as follows:

[image: enter image description here]

As you can see, c1, for example, is split into 3 parts and c2 into 2.

What I want to get is something like:

[image: enter image description here]

I know that I can merge the columns with:

df.sum(axis=1) or df.max(axis=1)

but I don't know how to specify that I want to do it only with specific columns.
Another possibility would be to create dataframes containing only the repeated columns, apply either sum or max to each, and then merge everything.
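A rough sketch of that second approach, for illustration only: the column labels and values below are made up, since the original data was only shown in the images. A boolean mask on df.columns picks every column sharing a label, and the per-label results are then stitched back together.

```python
import pandas as pd

# Illustrative data (an assumption, not the original file):
# C1 appears three times and C2 twice.
df = pd.DataFrame(
    [[5, 0, 9, 1, 7, 3, 3, 8],
     [3, 1, 10, 7, 1, 2, 3, 8],
     [1, 0, 0, 0, 4, 10, 6, 10]],
    columns=["C1", "C2", "C3", "C1", "C4", "C1", "C5", "C2"],
)

# Select every column labelled "C1" with a boolean mask and sum row-wise.
c1_total = df.loc[:, df.columns == "C1"].sum(axis=1)

# Repeat for each distinct label and reassemble into one dataframe.
pieces = {
    name: df.loc[:, df.columns == name].sum(axis=1)
    for name in df.columns.unique()
}
merged = pd.DataFrame(pieces)
print(merged)
```

Swapping .sum(axis=1) for .max(axis=1) gives the max variant. It works, but it loops over the labels by hand.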

But I was wondering if there is something less "ugly".

In a much simpler fashion, you can use groupby for that.

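In [1]: import numpy as np
   ...: import pandas as pd
   ...: # randint excludes the upper bound, so 11 gives integers in 0..10
   ...: df = pd.DataFrame(np.random.randint(0, 11, (3, 8)), columns=['C1','C2','C3','C1','C4','C1','C5','C2'])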

In [2]: df
Out[2]:
   C1  C2  C3  C1  C4  C1  C5  C2
0   5   0   9   1   7   3   3   8
1   3   1  10   7   1   2   3   8
2   1   0   0   0   4  10   6  10

In [3]: # Group the columns by label (level 0 on axis 1) and sum each group.
   ...: # Note: axis=1 in groupby is deprecated since pandas 2.1.
   ...: df.groupby(level=0, axis=1).sum()

Out[3]:
   C1  C2  C3  C4  C5
0   9   8   9   7   3
1  12   9  10   1   3
2  11  10   0   4   6
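On recent pandas versions, passing axis=1 to groupby emits a deprecation warning (since pandas 2.1). One possible workaround, sketched here as an assumption rather than as part of the original answer, is to transpose, group on the index, and transpose back. The values are hard-coded to match the example output above so the result can be checked.

```python
import pandas as pd

# Same values as the example above, hard-coded for reproducibility.
df = pd.DataFrame(
    [[5, 0, 9, 1, 7, 3, 3, 8],
     [3, 1, 10, 7, 1, 2, 3, 8],
     [1, 0, 0, 0, 4, 10, 6, 10]],
    columns=["C1", "C2", "C3", "C1", "C4", "C1", "C5", "C2"],
)

# Transpose so the repeated labels become the index, group on that index,
# aggregate, then transpose back to the original orientation.
merged_sum = df.T.groupby(level=0).sum().T
merged_max = df.T.groupby(level=0).max().T

print(merged_sum)
print(merged_max)
```

The double transpose copies the data, so for very wide or mixed-dtype frames it may be worth measuring before relying on it.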
