
Python, Pandas: average every 2 rows together

A pretty basic question, but I was wondering:

What is the 'proper' way to average every 2 rows together in a pandas DataFrame, and thus end up with only half the number of rows?

Note that this is different from rolling_mean, since it reduces the number of entries.

A fast way to do it:

>>> import pandas as pd
>>> s = pd.Series(range(10))
>>> s
0    0
1    1
2    2
3    3
4    4
5    5
6    6
7    7
8    8
9    9
>>> ((s + s.shift(-1)) / 2)[::2]
0    0.5
2    2.5
4    4.5
6    6.5
8    8.5
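
One thing worth checking with the shift approach is what happens when the number of rows is odd. The sketch below wraps the expression above in a helper (the name pairwise_mean_shift is made up for illustration) and runs it on an even-length and an odd-length Series. With an odd length the last row has no partner, so shift(-1) supplies NaN and the final average is NaN.

```python
import numpy as np
import pandas as pd


def pairwise_mean_shift(s):
    """Average rows (0, 1), (2, 3), ... using the shift trick.

    Hypothetical helper name; the body is the expression from the post.
    """
    # s.shift(-1) aligns each row with the row that follows it;
    # [::2] then keeps only the first row of each pair.
    return ((s + s.shift(-1)) / 2)[::2]


even = pairwise_mean_shift(pd.Series(range(10)))
odd = pairwise_mean_shift(pd.Series(range(5)))

print(even)
print(odd)  # the last entry is NaN because row 4 has no partner
```

If the unpaired last row should be kept as-is or dropped, that has to be handled separately, for example with dropna().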

The "proper way" I guess would be something like:

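>>> import numpy as np
>>> a = s.index.values
>>> # repeat each label twice, then truncate to the original length
>>> idx = np.array([a, a]).T.flatten()[:len(a)]
>>> idx
array([0, 0, 1, 1, 2, 2, 3, 3, 4, 4])
>>> s.groupby(idx).mean()
0    0.5
1    2.5
2    4.5
3    6.5
4    8.5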

However, it is about 2x slower than the shift-based version above, and the gap grows with the size of the Series.
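
Since the question is about a DataFrame, here is a minimal sketch of the same groupby idea applied to one. It builds the group labels with integer division on the row positions rather than by duplicating the index, which is a substitution on my part and not something from the post. The column names a and b are made up for the example, and no timing claim is implied.

```python
import numpy as np
import pandas as pd

# Example frame; the column names are arbitrary.
df = pd.DataFrame({"a": range(10), "b": range(10, 20)})

# Rows 0 and 1 get label 0, rows 2 and 3 get label 1, and so on.
group_ids = np.arange(len(df)) // 2

# Average each pair of rows; the result has half as many rows.
halved = df.groupby(group_ids).mean()
print(halved)
```

With an odd number of rows the last row forms a group of its own, so it comes through unchanged instead of turning into NaN.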
