
Pandas DataFrame combine rows by column value, where rows can have NaNs

I have a Pandas DataFrame like the following:

            timestamp       A       B       C       D       E       F
0           1607594400000   83.69   NaN     NaN     NaN     1003.20 8.66
1           1607594400000   NaN     2.57    44.35   17.18   NaN     NaN
2           1607595000000   83.07   NaN     NaN     NaN     1003.32 8.68
3           1607595000000   NaN     3.00    42.31   20.08   NaN     NaN
..          ...             ...     ...     ...     ...     ...     ...
325         1607691600000   90.19   NaN     NaN     NaN     997.32  10.22
326         1607691600000   NaN     1.80    30.10   14.85   NaN     NaN
328         1607692200000   NaN     1.60    26.06   12.78   NaN     NaN
327         1607692200000   91.33   NaN     NaN     NaN     997.52  10.21

I need to combine the rows that share the same timestamp value. Where a column holds a NaN in one row and a value in the other, the value should be kept; where both rows hold a value, the average of the two should be calculated.

I tried the solution from the following question, but it does not exactly match my situation and I don't know how to adapt it: pandas, combine rows based on certain column values and NAN

Just use groupby with mean. mean skips NaN by default, so a lone value in a group is kept and two values are averaged:

df.groupby('timestamp', as_index=False).mean()
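To see how this behaves, here is a minimal sketch on a cut-down frame. The first values are copied from the question; the last E value (1003.40) is a made-up overlap, added only so that one group contains two non-NaN values in the same column.

```python
import numpy as np
import pandas as pd

# Small stand-in for the question's data (columns A, B, E only).
df = pd.DataFrame({
    "timestamp": [1607594400000, 1607594400000, 1607595000000, 1607595000000],
    "A": [83.69, np.nan, 83.07, np.nan],
    "B": [np.nan, 2.57, np.nan, 3.00],
    "E": [1003.20, np.nan, 1003.32, 1003.40],  # 1003.40 is hypothetical
})

# mean() ignores NaN within each group, so value + NaN keeps the value,
# while value + value yields their average.
out = df.groupby("timestamp", as_index=False).mean()
print(out)
```

In the second group, E comes out as the average of 1003.32 and 1003.40, while every column that had only one non-NaN entry keeps that entry unchanged.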

Try first; for each group it picks the first non-null value in every column:

out = df.groupby('timestamp', as_index=False).first()
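One caveat worth checking against the question: first does not average. A small sketch with invented numbers (the frame below is purely illustrative) shows where the two approaches differ.

```python
import numpy as np
import pandas as pd

# Illustrative data: group 1 has two real values in A, group 2 has one NaN.
df = pd.DataFrame({
    "timestamp": [1, 1, 2, 2],
    "A": [10.0, 20.0, np.nan, 5.0],
})

grouped = df.groupby("timestamp", as_index=False)
by_first = grouped.first()  # first non-null value per column and group
by_mean = grouped.mean()    # NaN-skipping average per column and group

print(by_first)
print(by_mean)
```

For group 2 both give 5.0, but for group 1 first returns 10.0 while mean returns 15.0. If every group has at most one non-NaN value per column, as in the sample shown in the question, the two results coincide.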

Or, with timestamp as the index:

# DataFrame.mean(level=0) was removed in pandas 2.0; groupby(level=0) is the equivalent
out = df.set_index('timestamp').groupby(level=0).mean()
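As a rough check that the index-based route gives the same numbers as grouping on the column, the sketch below compares the two on a tiny frame with values taken from the question. It uses groupby(level=0), since the level argument of DataFrame.mean is no longer available in pandas 2.0 and later.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "timestamp": [1607594400000, 1607594400000, 1607595000000, 1607595000000],
    "A": [83.69, np.nan, 83.07, np.nan],
    "B": [np.nan, 2.57, np.nan, 3.00],
})

# Group on the index level, then move timestamp back into a column.
via_index = df.set_index("timestamp").groupby(level=0).mean().reset_index()

# Group directly on the column.
via_column = df.groupby("timestamp", as_index=False).mean()

# Raises AssertionError if the two frames differ.
pd.testing.assert_frame_equal(via_index, via_column)
```

The only practical difference is where timestamp ends up: without reset_index it stays as the index of the result.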
