Why does a nan of type <class 'numpy.float64'> return -9223372036854775808 as an int64?

I came across some behavior that I find odd and replicated it. Simply put, why does

np.int64(np.float64(np.nan))

output

-9223372036854775808

(As pointed out in the comments, yes, this is -2^63, the most negative value of a signed int64.)
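
As a quick check of that figure, NumPy's own type information reports the same bound. A minimal sketch using only `np.iinfo`:

```python
import numpy as np

# Smallest value a signed 64-bit integer can hold.
int64_min = np.iinfo(np.int64).min

print(int64_min)            # -9223372036854775808
print(int64_min == -2**63)  # True
```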

In case it is relevant or of interest, my original use case was looking at dataframe indices of type np.float64 and converting to np.int64 (I don't normally nest types for no reason as in the simplified example above). Starting with an example dataframe:

    0   1
NaN 1   2
1.0 3   4
NaN 5   6

then running:

print(df.index.values[0])
print(type(df.index.values[0]))
print(df.index.values[0].astype(np.int64))
print(type(df.index.values[0].astype(np.int64)))

prints:

nan
<class 'numpy.float64'>
-9223372036854775808
<class 'numpy.int64'>

However, using plain Python types you can't convert a float nan to int this way:

print(np.nan)
print(type(np.nan))
print(np.nan.astype(np.int64))

out:

nan
<class 'float'>
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-130-0d779433eac7> in <module>
      1 print(np.nan)
      2 print(type(np.nan))
----> 3 print(np.nan.astype(np.int64))

AttributeError: 'float' object has no attribute 'astype'

Although in practice I was able to just change the nans to a value I knew would not be a key (0), I was curious: why do np.float64 values behave this way?
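
For reference, the workaround mentioned above (replacing nan with a sentinel before the cast) might look like the following sketch. The array `index_values` is a made-up stand-in for `df.index.values`, and 0 is only a safe sentinel when it cannot collide with a real key.

```python
import numpy as np

# Hypothetical index values, mirroring the example frame above.
index_values = np.array([np.nan, 1.0, np.nan])

# Swap nan for the sentinel *before* casting, so no nan ever reaches astype.
cleaned = np.where(np.isnan(index_values), 0.0, index_values)
as_int = cleaned.astype(np.int64)

print(as_int)  # [0 1 0]
```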

Your df.index.values is a NumPy array. Here a is a similar float64 array:

Out[34]: array([nan,  1., inf])
In [35]: a.dtype
Out[35]: dtype('float64')

Arrays have an astype method, and the developers chose to convert special floats like nan to some sort of integer (or, as discussed, to let the compiler/processor do it). The alternative would have been to raise an error.

In [36]: b=a.astype(int)
In [37]: b
Out[37]: array([-9223372036854775808,                    1, -9223372036854775808])
In [38]: b.dtype
Out[38]: dtype('int64')
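
If raising is the behavior you would rather have, one option is to check for non-finite values yourself before casting. The helper below is a hypothetical sketch (the name `checked_astype_int64` is made up here and is not part of NumPy):

```python
import numpy as np

def checked_astype_int64(values):
    """Cast to int64, raising instead of silently converting nan/inf.

    Hypothetical helper for illustration; not part of NumPy.
    """
    arr = np.asarray(values, dtype=np.float64)
    if not np.all(np.isfinite(arr)):
        raise ValueError("cannot convert non-finite values (nan/inf) to int64")
    return arr.astype(np.int64)

print(checked_astype_int64([1.0, 2.0]))  # [1 2]

try:
    checked_astype_int64([np.nan, 1.0, np.inf])
except ValueError as exc:
    print("refused:", exc)
```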

Casting nan or inf to other integer types, such as np.int32 or np.uint16, produces different values.
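
A small sketch for trying this on your own machine. As far as I know, the result of casting nan/inf to an integer type is not something NumPy guarantees and can vary by platform, so no expected output is shown. The `np.errstate` context is there on the assumption that your NumPy version emits an "invalid value" RuntimeWarning for such casts; on older versions it is simply a no-op.

```python
import numpy as np

a = np.array([np.nan, 1.0, np.inf])

# Collect the cast result for several integer dtypes.
# The nan/inf entries are platform-dependent; only the finite 1.0 is reliable.
results = {}
with np.errstate(invalid="ignore"):
    for dtype in (np.int64, np.int32, np.uint16):
        results[dtype.__name__] = a.astype(dtype)

for name, converted in results.items():
    print(name, converted)
```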

An object created with the np.float64 function is a lot like a 0d array: it has many of the same attributes and methods, including astype:

In [39]: np.float64(np.nan)
Out[39]: nan
In [40]: np.array(np.nan)
Out[40]: array(nan)
In [41]: Out[39].astype(int)
Out[41]: -9223372036854775808
In [42]: Out[40].astype(int)
Out[42]: array(-9223372036854775808)

np.nan, on the other hand, is a Python float object and does not have an astype method.

And Python's int doesn't like to do it either:

In [52]: int(np.nan)
Traceback (most recent call last):
  File "<ipython-input-52-03e21f51ddd3>", line 1, in <module>
    int(np.nan)
ValueError: cannot convert float NaN to integer
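
The distinction between the two kinds of nan can be checked directly. A minimal sketch (the variable names are just for illustration):

```python
import numpy as np

py_nan = np.nan              # plain Python float
np_nan = np.float64(np.nan)  # NumPy scalar

print(type(py_nan).__name__, hasattr(py_nan, "astype"))  # float False
print(type(np_nan).__name__, hasattr(np_nan, "astype"))  # float64 True

# np.float64 subclasses Python float, so int() refuses both the same way.
for value in (py_nan, np_nan):
    try:
        int(value)
    except ValueError as exc:
        print("int() raised:", exc)
```

So it is astype, not the np.float64 type as such, that performs the unchecked cast.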

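astype() is a method of NumPy arrays and NumPy scalars (pandas objects provide one as well), not of built-in Python types. np.nan is a plain Python float, so it has no astype method. Note that int(np.nan) is not a workaround either: it raises ValueError, as shown above.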
