My data looks like this:
...
                            A      B    C
2017-09-18 12:00:00  1.000010  18000  100
2017-09-18 17:00:00  1.000029  13500  400
2017-09-19 12:00:00  1.000025  18000  300
2017-09-19 17:00:00  1.000037  13500  300
...
Measures A, B, and C are taken at two distinct times on each day.
I need to collapse each day's two measurements into a single row. For example, for the first two rows:
a weighted average of column A, using column B as the weights
((A1 * B1) + (A2 * B2)) / (B1 + B2)
an average of column C
(C1 + C2) / 2
My difficulty is in applying df.groupby to these adjacent rows, given that they have distinct times and that columns A and B need a custom operation that differs from the one for C.
My expected output would be:
                               A    C
2017-09-18 12:00:00  1.000018143  250
2017-09-19 12:00:00  1.000030143  300
Any pointers would be greatly appreciated.
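To try the answers below, the sample can be rebuilt roughly as follows. This is a minimal sketch that assumes the timestamps form a DatetimeIndex (the answers rely on df.index.date); it also checks the first expected row by hand.

```python
import pandas as pd

# Sample data from the question; using a DatetimeIndex is an assumption.
df = pd.DataFrame(
    {
        "A": [1.000010, 1.000029, 1.000025, 1.000037],
        "B": [18000, 13500, 18000, 13500],
        "C": [100, 400, 300, 300],
    },
    index=pd.to_datetime(
        [
            "2017-09-18 12:00:00",
            "2017-09-18 17:00:00",
            "2017-09-19 12:00:00",
            "2017-09-19 17:00:00",
        ]
    ),
)

# Hand check of the first day: B-weighted average of A, plain average of C.
first = df.iloc[:2]
a_weighted = (first["A"] * first["B"]).sum() / first["B"].sum()
c_mean = first["C"].mean()
print(a_weighted, c_mean)
```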
You can group by the date part of the index and use apply:
df.groupby(df.index.date).apply(lambda x : pd.Series({'A':sum(x['A']*x['B'])/sum(x['B']),'C':(x['C']).mean()}))
                   A      C
2017-09-18  1.000018  250.0
2017-09-19  1.000030  300.0
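A variant of the same idea swaps the hand-written sum ratio for np.average with its weights argument. This is only a sketch: the helper name summarize is hypothetical, and the frame is rebuilt here under the assumption that the index is a DatetimeIndex.

```python
import numpy as np
import pandas as pd

# Sample data from the question; the DatetimeIndex is an assumption.
df = pd.DataFrame(
    {
        "A": [1.000010, 1.000029, 1.000025, 1.000037],
        "B": [18000, 13500, 18000, 13500],
        "C": [100, 400, 300, 300],
    },
    index=pd.to_datetime(
        ["2017-09-18 12:00", "2017-09-18 17:00",
         "2017-09-19 12:00", "2017-09-19 17:00"]
    ),
)


def summarize(g):
    # Hypothetical helper: np.average with weights computes
    # sum(A * B) / sum(B) within the group.
    return pd.Series(
        {"A": np.average(g["A"], weights=g["B"]), "C": g["C"].mean()}
    )


out = df.groupby(df.index.date).apply(summarize)
print(out)
```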
Or, without using apply:
t1=df.eval('A*B').groupby(df.index.date).sum()/df.groupby(df.index.date).B.sum()
t2=df.groupby(df.index.date).C.mean()
pd.concat([t1, t2], axis=1)
                   0    C
2017-09-18  1.000018  250
2017-09-19  1.000030  300
You can do this with groupby, apply, and mean:
def AB_weighted(g):
    return (g['A'] * g['B']).sum() / g['B'].sum()
g = df.groupby(df.index.date)
pd.concat([g.apply(AB_weighted), g['C'].mean()], keys=['A', 'C'], axis=1)
                   A    C
2017-09-18  1.000018  250
2017-09-19  1.000030  300
apply is needed for the first condition, since the groupby calculation uses multiple columns ("A" and "B"); mean() handles column "C". Another option is computing the product before the groupby, so we can avoid the call to apply (this is a little like @WB's second answer, but with a single sum call).
u = df.assign(D=df['A'] * df['B'])[['D', 'B']].groupby(df.index.date).sum()
u['A'] = u.pop('D') / u.pop('B')
u['C'] = df.groupby(df.index.date)['C'].mean()
u
                   A    C
2017-09-18  1.000018  250
2017-09-19  1.000030  300
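The expected output in the question keeps the first timestamp of each day as the index, whereas the results above are indexed by date. One possible way to carry that timestamp through is named aggregation. This is a sketch, not part of the original answers: the helper columns AB and ts are hypothetical names, and the index is assumed to be a DatetimeIndex.

```python
import pandas as pd

# Sample data from the question; the DatetimeIndex is an assumption.
df = pd.DataFrame(
    {
        "A": [1.000010, 1.000029, 1.000025, 1.000037],
        "B": [18000, 13500, 18000, 13500],
        "C": [100, 400, 300, 300],
    },
    index=pd.to_datetime(
        ["2017-09-18 12:00", "2017-09-18 17:00",
         "2017-09-19 12:00", "2017-09-19 17:00"]
    ),
)

# Hypothetical helper columns: the A*B product and the timestamp itself.
tmp = df.assign(AB=df["A"] * df["B"], ts=df.index)

out = tmp.groupby(tmp.index.date).agg(
    ts=("ts", "first"),  # first timestamp of each day
    AB=("AB", "sum"),
    B=("B", "sum"),
    C=("C", "mean"),
)
out["A"] = out.pop("AB") / out.pop("B")  # weighted average of A
out = out.set_index("ts")[["A", "C"]].rename_axis(None)
print(out)
```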