I can't for the life of me figure out why something that used to be so simple no longer works. I often map a dataframe column through a dictionary, and null values appear wherever a value isn't found among the dictionary keys, so the resulting column is floats plus nulls. Typically I convert with .astype("Int64") and boom, the non-nulls are now ints rather than floats, with everything else untouched.
Now I'm running into issues where I treat my data, use Int64 conversions, and the acceptance tests pass, yet further down the pipeline the data deployment fails because floats are found in these columns.
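One way to narrow down where the floats come back is to check dtypes explicitly at each pipeline stage. The following is only a sketch: the helper name `non_integer_columns` is hypothetical, and it assumes a pandas version in which `pandas.api.types.is_integer_dtype` treats the nullable "Int64" dtype as an integer dtype.

```python
import pandas as pd
from pandas.api.types import is_integer_dtype


def non_integer_columns(df, columns):
    """Return the names in `columns` whose dtype is not an integer dtype.

    Hypothetical helper for illustration; not part of pandas.
    """
    return [c for c in columns if not is_integer_dtype(df[c])]


df = pd.DataFrame({"keys": [5, 10, 15, 20]})

# Keys missing from the dictionary become NaN, so the column is float64 here.
df["after_mapping"] = df["keys"].map({1: 0, 2: 2, 5: 25, 15: 305})
before = non_integer_columns(df, ["keys", "after_mapping"])

# After the nullable-integer cast the column should no longer be flagged.
df["after_mapping"] = df["after_mapping"].astype("Int64")
after = non_integer_columns(df, ["keys", "after_mapping"])

print(before, after)
```

Calling a check like this right before deployment, and not only in the acceptance tests, would show whether some later step turned the column back into floats.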
Just to make sure I'm not insane, I opened a Jupyter notebook, initialized a basic dataframe, mapped it through a dictionary that is missing some of the dataframe's values as keys, then cast to "Int64"... and I still get this issue. What's going on? I'm sure this used to be so simple.
import pandas as pd
df = pd.DataFrame({"keys": [5, 10, 15, 20]})
df["after_mapping"] = df["keys"].map({1: 0, 2: 2, 5: 25, 15: 305})
df["after_mapping"] = df["after_mapping"].astype("Int64")
ValueError: Cannot convert non-finite values (NA or inf) to integer
Works fine on my machine:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4 entries, 0 to 3
Data columns (total 2 columns):
 #   Column         Non-Null Count  Dtype
---  ------         --------------  -----
 0   keys           4 non-null      int64
 1   after_mapping  2 non-null      Int64
dtypes: Int64(1), int64(1)
memory usage: 196.0 bytes
Are you sure about your version number?
pd.__version__
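To compare environments, something like the following can be run in a fresh session on each machine. It is a minimal sketch that assumes a pandas release with the nullable "Int64" extension dtype (introduced in pandas 0.24); the variable name `mapped` is only for illustration.

```python
import pandas as pd

# Print the installed version so it can be compared across machines.
print(pd.__version__)

df = pd.DataFrame({"keys": [5, 10, 15, 20]})

# 10 and 20 are not keys of the dictionary, so they map to NaN
# and the intermediate result is a float64 column.
mapped = df["keys"].map({1: 0, 2: 2, 5: 25, 15: 305})
print(mapped.dtype)

# Cast to the nullable integer dtype; missing values are kept as <NA>.
df["after_mapping"] = mapped.astype("Int64")
print(df.dtypes)
```

If the cast raises the ValueError from the question in one environment but not the other, a difference in the installed pandas version is a likely place to look.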