Pandas remove rows only where NaN and float 0.0
I have a dataframe where each column represents a user. I am trying to eliminate any column that contains nothing but NaNs and 0.000000s, so that username 1 (the first column) would not be included in the dataframe, but the others would be.
This is the dataframe:
username 1 2 3 4 5
date
2019-01-16 NaN 9.16667 NaN NaN 1.000000
2019-01-17 NaN NaN NaN 1.000000 1.000000
2019-01-18 NaN 1.00000 0.956522 1.000000 1.000000
2019-01-19 0.000000 NaN 1.000000 NaN NaN
2019-01-20 0.000000 NaN 0.961538 NaN NaN
The percentages are stored as float64:
type(df['1'].iloc[0])
numpy.float64
You could start by replacing 0 with NaN and then drop the columns that contain only NaNs:
df.loc[:,~df.replace(0,np.nan).isna().all()]
username 2 3 4 5
0 date NaN NaN NaN NaN
1 2019-01-16 9.16667 NaN NaN 1.0
2 2019-01-17 NaN NaN 1.0 1.0
3 2019-01-18 1.00000 0.956522 1.0 1.0
4 2019-01-19 NaN 1.000000 NaN NaN
5 2019-01-20 NaN 0.961538 NaN NaN
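For reference, here is a minimal self-contained sketch of the same idea. It rebuilds the question's data by hand, assuming the usernames are string column labels and that date is the index (the output above suggests it was read in as an ordinary row instead). The name empty_cols is only illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the question's frame; string column labels and a date index are assumptions.
df = pd.DataFrame(
    {
        "1": [np.nan, np.nan, np.nan, 0.0, 0.0],
        "2": [9.16667, np.nan, 1.0, np.nan, np.nan],
        "3": [np.nan, np.nan, 0.956522, 1.0, 0.961538],
        "4": [np.nan, 1.0, 1.0, np.nan, np.nan],
        "5": [1.0, 1.0, 1.0, np.nan, np.nan],
    },
    index=pd.to_datetime(
        ["2019-01-16", "2019-01-17", "2019-01-18", "2019-01-19", "2019-01-20"]
    ),
)
df.index.name = "date"
df.columns.name = "username"

# True for columns that are entirely NaN once zeros are treated as missing.
empty_cols = df.replace(0, np.nan).isna().all()

# Keep every column that is not "empty"; values in the kept columns are unchanged.
result = df.loc[:, ~empty_cols]
print(result.columns.tolist())
```

Because replace is only used to build the boolean selector, the original values in the remaining columns are not modified.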
You can first convert 0 values to NaN via mask and then use dropna:
df = df.mask(df.eq(0)).dropna(how='all', axis=1)
Note that this converts 0 values to NaN even in the columns that are not deleted. It's not clear whether this is what you want, but it is probably advisable for consistency.
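If the zeros in the remaining columns should be preserved, one possible variant is to select columns with a boolean mask rather than overwriting the values. The sketch below uses a small hypothetical frame (columns a and b are made up for illustration) to contrast the two behaviours.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: "a" holds only NaN/0, "b" has a 0 alongside a real value.
df = pd.DataFrame({"a": [np.nan, 0.0, 0.0], "b": [0.0, 1.0, np.nan]})

# mask + dropna: drops "a", but the 0 in "b" becomes NaN as well.
masked = df.mask(df.eq(0)).dropna(how="all", axis=1)

# Boolean column selection: drops "a" and leaves the 0 in "b" untouched.
keep = df.fillna(0).ne(0).any()
kept = df.loc[:, keep]

print(masked)
print(kept)
```

Which of the two is preferable depends on whether a 0 in a retained column is meaningful data or just another way of saying "missing".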