I have to create a dataframe from a file in which some columns are repeated and their values are split across the copies, as follows:
As you can see, c1, for example, is split into 3 parts and c2 into 2.
What I want to get is something like:
I know that I can merge columns with:
df.sum(axis=1) or df.max(axis=1)
but I don't know how to specify that I want to do it only with specific columns.
Another possibility could be to create dataframes with only the repeated columns, apply either sum or max and then merge everything.
But I was wondering if there is something less "ugly".
More simply, you can use groupby for this.
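In [1]: import numpy as np
import pandas as pd
# randint's upper bound is exclusive, so 11 yields integers from 0 to 10
df = pd.DataFrame(np.random.randint(0, 11, (3, 8)), columns=['C1','C2','C3','C1','C4','C1','C5','C2'])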
In [2]: df
Out[2]:
   C1  C2  C3  C1  C4  C1  C5  C2
0   5   0   9   1   7   3   3   8
1   3   1  10   7   1   2   3   8
2   1   0   0   0   4  10   6  10
In [3]: # Groupby level 0 on axis 1 (columns) and apply a sum
df.groupby(level=0, axis=1).sum()
Out[3]:
   C1  C2  C3  C4  C5
0   9   8   9   7   3
1  12   9  10   1   3
2  11  10   0   4   6
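As far as I know, recent pandas releases deprecate the axis=1 argument to groupby, so here is a minimal sketch of the same idea that avoids it by transposing, grouping on the index, and transposing back. The frame reuses the values printed above instead of random data, and the names summed and maxed are only illustrative. It also shows max in place of sum, since the question mentions both.

```python
import pandas as pd

# Same values as the example output above, with duplicated column labels.
df = pd.DataFrame(
    [[5, 0, 9, 1, 7, 3, 3, 8],
     [3, 1, 10, 7, 1, 2, 3, 8],
     [1, 0, 0, 0, 4, 10, 6, 10]],
    columns=["C1", "C2", "C3", "C1", "C4", "C1", "C5", "C2"],
)

# Transpose so the duplicated labels become the index,
# group rows that share a label, aggregate, then transpose back.
summed = df.T.groupby(level=0).sum().T
maxed = df.T.groupby(level=0).max().T

print(summed)
print(maxed)
```

Columns that appear only once (C3, C4, C5 here) form single-member groups, so they pass through unchanged and there is no need to list the repeated columns by hand.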