How to get sum of k consecutive rows in pandas dataframe?

I have this pandas Series:

import pandas as pd
ts = pd.Series([1, 2, 3, 4, 5, 6, 7, 8])

I would like a DataFrame with an additional column containing the sum of each block of four consecutive rows of ts. The sum should be repeated in every row of its block.

In this case, the new DataFrame should look like this:

index ts sum
0 1 10
1 2 10
2 3 10
3 4 10
4 5 26
5 6 26
6 7 26
7 8 26

How could I do this?

Use GroupBy.transform, grouping by the integer division of the index by k:

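k = 4
# rows 0..k-1 get group label 0, rows k..2k-1 get label 1, and so on
a = ts.groupby(ts.index // k).transform('sum')
# alternative if the index is not a default RangeIndex (needs: import numpy as np)
# a = ts.groupby(np.arange(len(ts)) // k).transform('sum')
print(a)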
0    10
1    10
2    10
3    10
4    26
5    26
6    26
7    26
dtype: int64
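
To make the grouping step more concrete, here is a minimal sketch that prints the block labels produced by the integer division and the per-block sums before they are broadcast back to the rows. The names labels, block_sums and result are only illustrative, and the positional np.arange variant is used so the sketch does not depend on the index.

```python
import numpy as np
import pandas as pd

ts = pd.Series([1, 2, 3, 4, 5, 6, 7, 8])
k = 4

# Positional block labels: the first k rows get 0, the next k rows get 1, ...
labels = np.arange(len(ts)) // k
print(labels)  # [0 0 0 0 1 1 1 1]

# One sum per block (a Series with one entry per label)
block_sums = ts.groupby(labels).sum()

# transform('sum') repeats each block's sum on every row of that block
result = ts.groupby(labels).transform('sum')
print(result.tolist())  # [10, 10, 10, 10, 26, 26, 26, 26]
```

The difference between sum and transform('sum') is only the shape: the former returns one value per block, the latter returns a Series aligned with the original rows.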
    

If you need a DataFrame with two columns, convert the Series with Series.to_frame first:

df = ts.to_frame('ts')
# select the column so that transform returns a Series
df['sum'] = df.groupby(df.index // k)['ts'].transform('sum')
print(df)
   ts  sum
0   1   10
1   2   10
2   3   10
3   4   10
4   5   26
5   6   26
6   7   26
7   8   26
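
If the DataFrame does not have a default integer index, dividing the index by k will not work. The following sketch, which assumes a hypothetical string index a..h that is not part of the original question, applies the positional np.arange alternative mentioned in the comment above to a DataFrame.

```python
import numpy as np
import pandas as pd

# Hypothetical data: same values as in the question, but with a string index
df = pd.DataFrame({'ts': [1, 2, 3, 4, 5, 6, 7, 8]}, index=list('abcdefgh'))
k = 4

# Group by row position instead of index value
groups = np.arange(len(df)) // k
df['sum'] = df.groupby(groups)['ts'].transform('sum')
print(df)
```

The original index is kept on the result, because transform returns values aligned with the rows of the input.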
    

You could also try this list comprehension:

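import pandas as pd

ts = pd.Series([1, 2, 3, 4, 5, 6, 7, 8])
# one-element list per block of 4, repeated 4 times; sum(..., []) flattens the lists
# note: this assumes len(ts) is a multiple of 4
sum_ = sum([[sum(ts[i:i + 4])] * 4 for i in range(0, len(ts), 4)], [])
df = pd.DataFrame({'ts': ts, 'sum': sum_})
print(df)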

Output:

   ts  sum
0   1   10
1   2   10
2   3   10
3   4   10
4   5   26
5   6   26
6   7   26
7   8   26
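
The list comprehension repeats every block sum exactly four times, so it only lines up with ts when the length is a multiple of four. A possible variant, sketched here with a hypothetical ten-element Series and a plain loop, repeats each sum only as many times as the block has rows, so a shorter last block is handled as well.

```python
import pandas as pd

# Hypothetical input whose length is not a multiple of k
ts = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
k = 4

sums = []
for start in range(0, len(ts), k):
    block = ts.iloc[start:start + k]          # positional slice; last block may be shorter
    sums.extend([block.sum()] * len(block))   # repeat the sum once per row in the block

df = pd.DataFrame({'ts': ts, 'sum': sums})
print(df['sum'].tolist())  # [10, 10, 10, 10, 26, 26, 26, 26, 19, 19]
```

Whether a partial last block should get its own sum or be left empty depends on the use case; this sketch simply sums whatever rows remain.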
