In a column, fill values that are not a number with "NaN"
I have a DataFrame with a column whose values look like this:
index some_column
0 12345
1 23549
2 .....
3 78516
4 98713
5 .....
I want to check the values in the column, and if a value is not a number (i.e. if the value is "....."), I want to replace it with np.NaN.
I've tried the function below:
import numpy as np
from numbers import Number

def fill_in(values):
    if isinstance(values, Number) == False:
        return np.NaN
Then I use .apply on the column:
df['some_column'].apply(fill_in)
I expected:
index some_column
0 12345
1 23549
2 NaN
3 78516
4 98713
5 NaN
But instead I got:
index some_column
0 NaN
1 NaN
2 NaN
3 NaN
4 NaN
5 NaN
Can someone please explain what I got wrong?
The function you pass to apply must return a value for every input. In your case, nothing is returned when the if test fails.
When pandas gets no value back from the function (in Python, a function that ends without a return statement returns None), it has nothing to put in that position and fills it with NaN.
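To see the missing return value directly, here is a small sketch that does not need pandas. The name fill_in_missing_branch is made up for illustration, and float("nan") stands in for np.NaN so the snippet has no third-party imports:

```python
from numbers import Number


def fill_in_missing_branch(value):
    # Only the "not a number" branch returns something.
    if not isinstance(value, Number):
        return float("nan")
    # No return statement here: for numeric input Python implicitly returns None.


result_for_number = fill_in_missing_branch(12345)
result_for_string = fill_in_missing_branch(".....")

print(result_for_number)  # None
print(result_for_string)  # nan
```

So the number you wanted to keep is the one that gets lost, because that branch never hands it back.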
Adding a return value for the case where the test fails should give you the desired output.
def fill_in(value):
    if not isinstance(value, Number):
        # np.nan is used here because the np.NaN alias was removed in NumPy 2.0
        return np.nan
    else:
        return value
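One caveat, which is an assumption about your data rather than something shown in the question: if the column was read from a file, the numeric-looking entries may actually be stored as strings such as "12345". In that case isinstance(value, Number) is False for every row and the function would still return NaN everywhere. A minimal sketch of an alternative for that situation, using made-up sample data and pd.to_numeric with errors="coerce":

```python
import pandas as pd

# Hypothetical sample data: the numbers are stored as strings because the
# column also contains the placeholder ".....".
df = pd.DataFrame(
    {"some_column": ["12345", "23549", ".....", "78516", "98713", "....."]}
)

# errors="coerce" turns anything that cannot be parsed as a number into NaN.
df["some_column"] = pd.to_numeric(df["some_column"], errors="coerce")

print(df)
```

Note that the resulting column is a float column, since NaN is a floating-point value. You can check what your column really holds with df['some_column'].map(type).unique().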