
Is it possible to fix 'ValueError: cannot convert float NaN to integer' error without removing the NaN values?

My problem lies in preparing a DataFrame for a heatmap using pandas and seaborn. My question is whether there is a way to keep the NaN values as NaN while converting everything else from object to integer, so that I can plot it with something like sns.heatmap(df, mask = df.isnull())

So far I have created a new DataFrame, which looks like this ( https://imgur.com/a/fEDcnoi ) upon creation, and I am entering data into it.

From there I insert the values into the new DataFrame using code that looks like:

start = 16
end = start + 10
dates = range(start,end)
for d in dates:
    str(d)
    for i, row in jfk10day.iterrows():
        row[f'Apr/{d}/2019'] = jfk[jfk['Pick-up Date'] == f'Apr/{d}/2019'][jfk['Supplier']==i][jfk['Car Type'] == 'Compact']['Total Price'].min()

This enters the data into the DataFrame as type object. The completed DataFrame looks like https://imgur.com/3m41KtL .

Now from here I know that I need to change the datatype to int/float in order to plot it using sns.heatmap(), but when I try something like:

jfk10day = jfk10day.astype(int)

I get the error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-76-45dab2567d52> in <module>
----> 1 jfk10day.astype(int)

/anaconda3/lib/python3.7/site-packages/pandas/util/_decorators.py in wrapper(*args, **kwargs)
    176                 else:
    177                     kwargs[new_arg_name] = new_arg_value
--> 178             return func(*args, **kwargs)
    179         return wrapper
    180     return _deprecate_kwarg

/anaconda3/lib/python3.7/site-packages/pandas/core/generic.py in astype(self, dtype, copy, errors, **kwargs)
   4999             # else, only a single dtype is given
   5000             new_data = self._data.astype(dtype=dtype, copy=copy, errors=errors,
-> 5001                                          **kwargs)
   5002             return self._constructor(new_data).__finalize__(self)
   5003 

/anaconda3/lib/python3.7/site-packages/pandas/core/internals.py in astype(self, dtype, **kwargs)
   3712 
   3713     def astype(self, dtype, **kwargs):
-> 3714         return self.apply('astype', dtype=dtype, **kwargs)
   3715 
   3716     def convert(self, **kwargs):

/anaconda3/lib/python3.7/site-packages/pandas/core/internals.py in apply(self, f, axes, filter, do_integrity_check, consolidate, **kwargs)
   3579 
   3580             kwargs['mgr'] = self
-> 3581             applied = getattr(b, f)(**kwargs)
   3582             result_blocks = _extend_blocks(applied, result_blocks)
   3583 

/anaconda3/lib/python3.7/site-packages/pandas/core/internals.py in astype(self, dtype, copy, errors, values, **kwargs)
    573     def astype(self, dtype, copy=False, errors='raise', values=None, **kwargs):
    574         return self._astype(dtype, copy=copy, errors=errors, values=values,
--> 575                             **kwargs)
    576 
    577     def _astype(self, dtype, copy=False, errors='raise', values=None,

/anaconda3/lib/python3.7/site-packages/pandas/core/internals.py in _astype(self, dtype, copy, errors, values, klass, mgr, **kwargs)
    662 
    663                 # _astype_nansafe works fine with 1-d only
--> 664                 values = astype_nansafe(values.ravel(), dtype, copy=True)
    665                 values = values.reshape(self.shape)
    666 

/anaconda3/lib/python3.7/site-packages/pandas/core/dtypes/cast.py in astype_nansafe(arr, dtype, copy)
    707         # work around NumPy brokenness, #1987
    708         if np.issubdtype(dtype.type, np.integer):
--> 709             return lib.astype_intsafe(arr.ravel(), dtype).reshape(arr.shape)
    710 
    711         # if we have a datetime/timedelta array of objects

pandas/_libs/lib.pyx in pandas._libs.lib.astype_intsafe()

pandas/_libs/src/util.pxd in util.set_value_at_unsafe()

ValueError: cannot convert float NaN to integer

So I am wondering if there is a way to edit my for loop so that every entry is entered as an int (the original dataframe 'Total Price' is already int), or if there is a way to convert the new dataframe to type int while skipping over the NaN values. I need the NaN values in the heatmap to show that the supplier is not offering anything on that specific date.

Thanks in advance for the help guys, and if there is any more information needed from me please let me know!

Since pandas 0.24.0 there is a nullable integer data type, 'Int64' (note the capital I), which can hold missing values alongside integers:

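import numpy as np
import pandas as pd

# np.nan is the portable spelling; the np.NaN alias was removed in NumPy 2.0
df = pd.DataFrame({'Col': [1.0, 2.0, 3.0, np.nan]})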
print(df)

   Col
0  1.0
1  2.0
2  3.0
3  NaN 

print(df.Col.astype('Int64'))

0      1
1      2
2      3
3    NaN
Name: Col, dtype: Int64
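
For the heatmap case in the question, a minimal sketch of one possible approach follows. It assumes that converting to float is acceptable for plotting, since a float column can hold NaN directly. The small frame and the supplier names are hypothetical stand-ins for jfk10day, not the asker's real data.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the object-dtype frame built by the loop in the question.
prices = pd.DataFrame(
    {"Apr/16/2019": [35, np.nan, 42], "Apr/17/2019": [np.nan, 51, 40]},
    index=["SupplierA", "SupplierB", "SupplierC"],
    dtype=object,
)

# Float columns can hold NaN, so this conversion does not raise.
as_float = prices.astype(float)

# Optional: nullable integers keep whole-number values and still preserve the gaps.
as_nullable_int = as_float.astype("Int64")

# Plotting would then use the float frame (requires seaborn, not imported here):
# sns.heatmap(as_float, mask=as_float.isnull())
```

Recent pandas versions display the missing entries of a nullable integer column as <NA> rather than NaN.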
