pandas: delete a row having the same value in a column as the previous row
Here is my df:
import pandas as pd

data = {'utime': [1461098442, 1461098443, 1461098443, 1461098444, 1461098445],
        'lat': [41.1790265, 41.1791703, 41.1791464, 41.1791703, 41.1791419],
        'lon': [-8.5951883, -8.5951229, -8.5951376, -8.5951229, -8.5951365]}
df = pd.DataFrame(data)
df
        utime        lat       lon
0  1461098442  41.179026 -8.595188
1  1461098443  41.179170 -8.595123
2  1461098443  41.179146 -8.595138
3  1461098444  41.179170 -8.595123
4  1461098445  41.179142 -8.595137
Two samples were received at the same time (unix epoch 1461098443), so I want to retain one and delete the other.
So that I have:
        utime        lat       lon
0  1461098442  41.179026 -8.595188
1  1461098443  41.179170 -8.595123
3  1461098444  41.179170 -8.595123
4  1461098445  41.179142 -8.595137
drop_duplicates should help (see https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.drop_duplicates.html):
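df = df.drop_duplicates(subset='utime')
Alternatively, group by utime and keep the first row of each group, which also renumbers the index:
df = df.groupby('utime', as_index=False).agg('first')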
        utime        lat       lon
0  1461098442  41.179026 -8.595188
1  1461098443  41.179170 -8.595123
2  1461098444  41.179170 -8.595123
3  1461098445  41.179142 -8.595137
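To make the difference between the approaches concrete, here is a minimal sketch using the df from the question. The variable names deduped, consecutive, and grouped are made up for illustration. The shift-based filter is not from the answers above; it is an assumed alternative for the case where only a row that repeats the value of the row directly before it should be dropped, which is what the title literally asks for.

```python
import pandas as pd

data = {
    "utime": [1461098442, 1461098443, 1461098443, 1461098444, 1461098445],
    "lat": [41.1790265, 41.1791703, 41.1791464, 41.1791703, 41.1791419],
    "lon": [-8.5951883, -8.5951229, -8.5951376, -8.5951229, -8.5951365],
}
df = pd.DataFrame(data)

# Option 1: drop every repeated utime anywhere in the frame, keeping the
# first occurrence. The original index labels are preserved.
deduped = df.drop_duplicates(subset="utime", keep="first")

# Option 2 (assumed alternative): drop a row only when its utime equals the
# utime of the row immediately before it, i.e. consecutive duplicates only.
consecutive = df[df["utime"] != df["utime"].shift()]

# Option 3: keep the first row per utime via groupby; the index is renumbered.
grouped = df.groupby("utime", as_index=False).first()

print(deduped)
print(consecutive)
print(grouped)
```

On this data all three keep the same four rows. Options 1 and 2 keep the index labels 0, 1, 3, 4 shown in the question, while option 3 renumbers them 0 to 3. Options 1 and 2 would differ if the same utime reappeared later in the frame, separated by other values: drop_duplicates would remove the later occurrence, the shift comparison would keep it.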