Error: cannot convert float NaN to integer in pandas

I have the dataframe:

   a            b     c      d
0 nan           Y     nan   nan
1  1.27838e+06  N      3     96
2 nan           N      2    nan
3  284633       Y     nan    44

I am trying to convert the non-null values to integer type to avoid exponential notation (1.27838e+06):

f=lambda x : int(x)
df['a']=np.where(df['a']==None,np.nan,df['a'].apply(f))

But I still get the error, even though I only want to change the dtype of the non-null values. Can anyone point out my mistake? Thanks.

pandas cannot store NaN values in a column with a NumPy integer dtype. Strictly speaking, you could have a column with mixed data types, but this can be computationally inefficient. So if you insist, you can do:

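# Switch to object dtype so the column can hold Python ints alongside NaN
df['a'] = df['a'].astype('O')
# Convert only the non-null entries to int
df.loc[df['a'].notnull(), 'a'] = df.loc[df['a'].notnull(), 'a'].astype(int)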
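As an alternative to an object column, newer pandas releases ship a nullable integer extension dtype, "Int64" (capital I), which keeps missing entries as pd.NA. The following is only a sketch, assuming a pandas version that supports it (1.0 or later should do); the column values are rebuilt by hand from the question's printout, so 1278380.0 is an assumed stand-in for the displayed 1.27838e+06.

```python
import numpy as np
import pandas as pd

# Column 'a' from the question, rebuilt by hand for illustration
# (1278380.0 is an assumed value matching the displayed 1.27838e+06)
df = pd.DataFrame({"a": [np.nan, 1278380.0, np.nan, 284633.0]})

# "Int64" is the nullable integer extension dtype; NaN entries become
# pd.NA instead of triggering the float-NaN-to-integer error
df["a"] = df["a"].astype("Int64")

print(df["a"].dtype)
print(df["a"].tolist())
```

The column stays a single typed column rather than a mix of Python objects, so arithmetic and aggregation should behave more like ordinary integer data than with dtype=object.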

As far as I have read in the pandas documentation, it is not possible to represent an integer NaN:

"In the absence of high performance NA support being built into NumPy from the ground up, the primary casualty is the ability to represent NAs in integer arrays."

As explained later in the documentation, this is due to memory and performance reasons, and also so that the resulting Series continues to be “numeric”. One possibility is to use dtype=object arrays instead.
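To make this concrete, here is a small sketch (the variable names message, upcast and mixed are made up for illustration). It shows where the error in the question comes from, how a missing value pushes an integer Series to float64, and how dtype=object avoids that.

```python
import numpy as np
import pandas as pd

# Calling int() on NaN is what raises the error in the question
try:
    int(np.nan)
except ValueError as exc:
    message = str(exc)

# A missing value forces an otherwise integer Series up to float64
upcast = pd.Series([3, 2, None])

# dtype=object keeps the Python ints as they are, next to the missing value
mixed = pd.Series([3, 2, None], dtype=object)

print(message)
print(upcast.dtype, mixed.dtype)
```

This is why df['a'].apply(f) fails in the question: apply calls int on every element, including the NaN rows, before np.where gets a chance to filter anything.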
