
casting as “Int64” nullable integer type no longer seems to work

I can't for the life of me figure out why something that used to be so simple no longer works. I often map a dataframe column through a dictionary, and some values come back null because they aren't among the dictionary's keys, so the resulting column is floats plus nulls. Typically I convert with .astype("Int64") and boom, the non-nulls are now ints instead of floats, with everything else untouched.

Now I'm running into a problem: I process my data, apply the Int64 conversions, and the acceptance tests pass, yet later in the pipeline the data deployment fails because floats are found in these columns.

Just to make sure I'm not insane, I opened a Jupyter notebook and initialized a basic dataframe, mapped it through a dictionary that lacks some of the dataframe's values as keys, then cast to "Int64"... and I still get this issue. What's going on? I'm sure this used to be so simple...

import pandas as pd

df = pd.DataFrame({"keys": [5, 10, 15, 20]})

df["after_mapping"] = df["keys"].map({1: 0, 2: 2, 5: 25, 15: 305})

df["after_mapping"] = df["after_mapping"].astype("Int64")

ValueError: Cannot convert non-finite values (NA or inf) to integer

Works fine on my machine:

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4 entries, 0 to 3
Data columns (total 2 columns):
 #   Column         Non-Null Count  Dtype
---  ------         --------------  -----
 0   keys           4 non-null      int64
 1   after_mapping  2 non-null      Int64
dtypes: Int64(1), int64(1)
memory usage: 196.0 bytes

Are you sure about your version number?

pd.__version__
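
To compare environments, a minimal sketch along these lines may help. It prints the installed version, repeats the cast from the question, and contrasts it with the lowercase NumPy dtype "int64", which cannot hold missing values and raises a ValueError on recent pandas releases. Whether a lowercase cast or an older pandas install is behind the error in the question is only an assumption; the names mapped, nullable and lowercase_error are illustrative.

```python
import pandas as pd

# Version of pandas actually imported by this interpreter or kernel
print(pd.__version__)

df = pd.DataFrame({"keys": [5, 10, 15, 20]})

# Keys missing from the dictionary become NaN, so the result is float64
mapped = df["keys"].map({1: 0, 2: 2, 5: 25, 15: 305})
print(mapped.dtype)

# Capital "I": pandas' nullable integer extension dtype, keeps missing values
nullable = mapped.astype("Int64")
print(nullable.dtype)

# Lowercase "i": plain NumPy int64, which has no way to represent NaN
lowercase_error = None
try:
    mapped.astype("int64")
except ValueError as exc:
    lowercase_error = exc
    print(exc)
```

If the nullable cast succeeds here but fails in the pipeline, it is worth checking that both environments import the same pandas version and that the dtype string is spelled with a capital "I" everywhere.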
