Error: cannot convert float NaN to integer in pandas

Question

I have the following DataFrame:

   a            b     c      d
0 nan           Y     nan   nan
1  1.27838e+06  N      3     96
2 nan           N      2    nan
3  284633       Y     nan    44

I am trying to convert the non-null values to integer type to avoid exponential notation (1.27838e+06):

f=lambda x : int(x)
df['a']=np.where(df['a']==None,np.nan,df['a'].apply(f))

But I get an error, even though I only want to change the dtype of the non-null values. Can anyone point out my mistake? Thanks.

Answer 1

Pandas cannot store NaN values in a NumPy-backed integer column (int64 and similar), because NaN is a floating-point value. Strictly speaking, you could have a column with mixed data types, but this can be computationally inefficient. So if you insist, you can do:

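# Switch the column to object dtype so it can hold ints alongside NaN
df['a'] = df['a'].astype('O')
# Convert only the non-null entries to int; NaN rows are left untouched
df.loc[df['a'].notnull(), 'a'] = df.loc[df['a'].notnull(), 'a'].astype(int)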
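
For context on why the attempt in the question fails, here is a minimal sketch. It assumes column 'a' holds the values shown in the question (with 1.27838e+06 approximated as 1278380.0); the variable names are illustrative only. Two things go wrong: `int()` cannot represent NaN, and `df['a'].apply(f)` is evaluated in full before `np.where` selects anything, so the NaN rows are never skipped. In addition, `== None` does not detect NaN.

```python
import numpy as np
import pandas as pd

# Column 'a' from the question (1.27838e+06 approximated as 1278380.0)
a = pd.Series([np.nan, 1278380.0, np.nan, 284633.0], name="a")

# 1) int() has no representation for NaN, so apply(int) raises
#    before np.where gets a chance to choose between its branches.
error_message = None
try:
    a.apply(int)
except ValueError as exc:
    error_message = str(exc)

# 2) Comparing against None never matches NaN; isna() is the reliable test.
matches_none = int((a == None).sum())  # noqa: E711
missing = int(a.isna().sum())

print(error_message)
print(matches_none, missing)
```

Selecting the non-null rows first with `notnull()`, as in the answer above, avoids both problems.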

Answer 2

As far as I have read in the pandas documentation, it is not possible to represent an integer NaN:

"In the absence of high performance NA support being built into NumPy from the ground up, the primary casualty is the ability to represent NAs in integer arrays."

As the documentation explains later, this is due to memory and performance reasons, and also so that the resulting Series continues to be "numeric". One possibility is to use dtype=object arrays instead.
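
Readers on newer pandas releases may have another option that neither answer covers: the nullable integer extension dtype `'Int64'` (capital I), available from pandas 0.24 onward. The sketch below assumes such a version and reuses the question's column 'a' with 1.27838e+06 approximated as 1278380.0. The cast from float only works when every non-missing value is a whole number.

```python
import numpy as np
import pandas as pd

# Column 'a' from the question (1.27838e+06 approximated as 1278380.0)
df = pd.DataFrame({"a": [np.nan, 1278380.0, np.nan, 284633.0]})

# 'Int64' is the nullable extension dtype, distinct from NumPy's 'int64'.
# Missing entries are kept as missing values instead of raising an error.
df["a"] = df["a"].astype("Int64")

print(df["a"].dtype)
print(df["a"].dropna().tolist())
```

Unlike the dtype=object approach, the column stays numeric, so arithmetic and aggregation continue to work on it.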

Answer 1
2 2017-07-04 03:34:08

Answer 2
1 2017-07-04 03:31:44
