
Sum of Every Two Columns in Pandas dataframe

I have run into a problem while using Pandas. Given this DataFrame:

import pandas as pd

df = pd.DataFrame([(1,2,3,4,5,6),(1,2,3,4,5,6),(1,2,3,4,5,6)], columns=['a','b','c','d','e','f'])
Out:
   a  b  c  d  e  f
0  1  2  3  4  5  6
1  1  2  3  4  5  6
2  1  2  3  4  5  6

What I want is an output DataFrame that looks like this:

Out:
    s1   s2   s3
0   3    7    11
1   3    7    11
2   3    7    11

That is, sum the column pairs (a,b), (c,d), (e,f) separately and name the resulting columns (s1,s2,s3). Could anyone help me solve this in Pandas? Thank you so much.

1) Group along the columns by passing axis=1 to groupby. Per @Boud's comment, a small tweak to the grouping array (adding 1 so the labels start at 1) gives exactly the column names you want:

import numpy as np
df.groupby((np.arange(len(df.columns)) // 2) + 1, axis=1).sum().add_prefix('s')


The columns are grouped by the following key array, in which integer division by 2 gives each consecutive pair of columns the same label:

np.arange(len(df.columns)) // 2
# array([0, 0, 1, 1, 2, 2], dtype=int32)
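
As a side note, recent pandas releases deprecate the axis=1 argument of groupby (as far as I know, starting with 2.1). A minimal sketch of the same grouping done through a transpose instead is shown below; the helper name sum_column_pairs is made up for illustration and is not part of the original answer.

```python
import numpy as np
import pandas as pd

def sum_column_pairs(df):
    # Hypothetical helper: one label per consecutive pair of columns (1, 1, 2, 2, ...)
    keys = np.arange(len(df.columns)) // 2 + 1
    # Transpose so the columns become rows, group the rows by label, sum, transpose back
    out = df.T.groupby(keys).sum().T
    return out.add_prefix('s')

df = pd.DataFrame([(1, 2, 3, 4, 5, 6)] * 3, columns=list('abcdef'))
result = sum_column_pairs(df)
print(result)
```

The transpose costs an extra copy, so this is likely to be slower on large frames; it is only meant to show that the grouping keys work the same way on either axis.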

2) Use np.add.reduceat, which is a faster alternative because it works directly on the underlying NumPy array:

df = pd.DataFrame(np.add.reduceat(df.values, np.arange(len(df.columns))[::2], axis=1))
df.columns = df.columns + 1
df.add_prefix('s')
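
For intuition, np.add.reduceat sums each slice that runs from one start index up to the next. A small NumPy-only sketch, using the values from the question (the variable names are illustrative):

```python
import numpy as np

values = np.array([[1, 2, 3, 4, 5, 6],
                   [1, 2, 3, 4, 5, 6],
                   [1, 2, 3, 4, 5, 6]])

# Start index of every pair of columns: 0, 2, 4
starts = np.arange(values.shape[1])[::2]

# Sums values[:, 0:2], values[:, 2:4] and values[:, 4:] along each row
pair_sums = np.add.reduceat(values, starts, axis=1)

# Equivalent explicit slicing, valid only when the column count is even
manual = values[:, 0::2] + values[:, 1::2]

print(pair_sums)
```

With an odd number of columns, reduceat would leave the last column in a segment of its own, whereas the slicing version would fail with a shape mismatch.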


Timings:

For a DataFrame of 1 million rows and 20 columns:

from string import ascii_lowercase
np.random.seed(42)
df = pd.DataFrame(np.random.randint(0, 10, (10**6,20)), columns=list(ascii_lowercase[:20]))
df.shape
(1000000, 20)

def with_groupby(df):
    return df.groupby((np.arange(len(df.columns)) // 2) + 1, axis=1).sum().add_prefix('s')

def with_reduceat(df):
    df = pd.DataFrame(np.add.reduceat(df.values, np.arange(len(df.columns))[::2], axis=1))
    df.columns = df.columns + 1
    return df.add_prefix('s')

# check that both approaches give the same output
with_groupby(df).equals(with_reduceat(df))
True

%timeit with_groupby(df.copy())
1 loop, best of 3: 1.11 s per loop

%timeit with_reduceat(df.copy())   # <--- (>3X faster)
1 loop, best of 3: 345 ms per loop
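
%timeit is an IPython magic. Outside a notebook, a rough equivalent with the standard timeit module might look like the sketch below. It assumes a smaller frame (10,000 rows) to keep the run short, and it uses a transposed groupby (with_groupby_t, a name introduced here) so that it also runs on pandas versions without axis=1 support. The absolute numbers will differ from those above and depend on the machine.

```python
import timeit
from string import ascii_lowercase

import numpy as np
import pandas as pd

rng = np.random.RandomState(42)
df = pd.DataFrame(rng.randint(0, 10, (10**4, 20)), columns=list(ascii_lowercase[:20]))

def with_groupby_t(df):
    # Assumed variant of with_groupby: group the transposed frame instead of using axis=1
    keys = np.arange(len(df.columns)) // 2 + 1
    return df.T.groupby(keys).sum().T.add_prefix('s')

def with_reduceat(df):
    out = pd.DataFrame(np.add.reduceat(df.values, np.arange(len(df.columns))[::2], axis=1))
    out.columns = out.columns + 1
    return out.add_prefix('s')

# Compare the raw values first, then take the best of three single runs for each function
same_values = np.array_equal(with_groupby_t(df).values, with_reduceat(df).values)
t_groupby = min(timeit.repeat(lambda: with_groupby_t(df), number=1, repeat=3))
t_reduceat = min(timeit.repeat(lambda: with_reduceat(df), number=1, repeat=3))

print("same values:", same_values)
print("groupby (transposed): %.4f s" % t_groupby)
print("reduceat:             %.4f s" % t_reduceat)
```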

