
Changing Column Type in Pandas DataFrame to int64

I am trying to change a column's data type from object to int64 within a DataFrame using .map():

   df['one'] = df['one'].map(convert_to_int_with_error)

Here is my function:

import numpy as np

def convert_to_int_with_error(x):
    # Treat empty strings, single spaces, and None as missing values
    if x not in ('', ' ', None):
        try:
            return np.int64(x)
        except ValueError as e:
            print(e)
            return None
    else:
        return None

This completes successfully. However, when I check the data type after completion, it reverts to float:

print("%s is a %s after converting" % (key, df['one'].dtype))

I think the problem is that your problematic values are converted from None to NaN, so int is cast to float (see the docs).
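A quick sketch of that cast with a plain Series (illustrative data):

    import pandas as pd

    s = pd.Series([1, 2, None])
    print(s.dtype)  # float64: the None is stored as NaN, which an int64 array cannot hold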

Instead of map, you can use to_numeric with the parameter errors='coerce' to convert problematic values to NaN:

df['one'] = pd.to_numeric(df['one'], errors='coerce')
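If you need an integer dtype even in the presence of missing values, a minimal sketch using the nullable Int64 extension dtype, assuming pandas 0.24 or newer:

    import pandas as pd

    df['one'] = pd.to_numeric(df['one'], errors='coerce').astype('Int64')
    print(df['one'].dtype)  # Int64 (nullable), which can hold pd.NA alongside integers

Note the capital I: 'Int64' is the nullable extension dtype, while 'int64' is the plain NumPy dtype that cannot represent missing values.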
