Cast to numeric data of length 20 in dataframe

My CSV file has a few numeric values that are 20 digits long. When I read it into a DataFrame, that column comes back as dtype object. I need to cast all numeric data to integers.

My CSV data looks like this:

emp_id,age,salary,marital
21012334509821345944,22,4500,married
21012334509821345945,22,4510,single
21012334509821345946,22,45040,married
21012334509821345947,22,41500,single
21012334509821345948,22,54500,single
21012334509821345949,22,64500,married

I tried :

import pandas as pd

d1 = pd.read_csv('D:\\Exercise\\test.csv')
d1.set_index('emp_id', inplace=True)
d1.index = d1.index.map(int)  # OverflowError: int too big to convert
print(d1.index.values)

If I comment out the index map line, I get output like this: ['21012334509821345944' '21012334509821345945' '21012334509821345946' '21012334509821345947' '21012334509821345948' '21012334509821345949']

But I need integers. So far I have only tried casting the first column. Is it possible to cast every column in the DataFrame that holds numeric values? I also tried casting with numpy and got the same error. Thanks.

The largest value an integer dtype (np.uint64) can represent is 18446744073709551615, so you probably won't be able to do this with a native integer dtype.
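As a quick check of that limit, here is a minimal sketch in plain Python (no pandas needed) comparing the first emp_id from the question against the uint64 maximum. The variable names are only illustrative.

```python
# Largest value an unsigned 64-bit integer can hold.
UINT64_MAX = 2**64 - 1

# First emp_id from the sample data; Python ints have arbitrary precision.
emp_id = int("21012334509821345944")

print(UINT64_MAX)            # 18446744073709551615
print(emp_id > UINT64_MAX)   # True
print(emp_id.bit_length())   # 65, one bit more than uint64 offers
```

The value needs 65 bits, which is why any conversion to a 64-bit dtype overflows.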

pandas and NumPy store integers in fixed-width dtypes of at most 64 bits, so there is a hard limit. To get past it, keep the column as dtype object but make the values themselves Python int.

This is one way:

df.emp_id.values[:] = [*map(int, df.emp_id)]

Then you can do math.

df.emp_id // int(1e10)

0    2101233450
1    2101233450
2    2101233450
3    2101233450
4    2101233450
5    2101233450
Name: emp_id, dtype: object

The math won't be optimized, since it runs element by element on Python ints rather than as a vectorized operation, but it should work.
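For the "cast all numeric columns" part of the question, here is a minimal sketch along the same lines. It assumes the sample CSV from the question (inlined as a string so it runs anywhere), non-negative integers only, and no missing values. The constant names and the loop are illustrative, not a pandas built-in.

```python
import io

import pandas as pd

CSV_TEXT = """emp_id,age,salary,marital
21012334509821345944,22,4500,married
21012334509821345945,22,4510,single
21012334509821345946,22,45040,married
"""

INT64_MIN, INT64_MAX = -2**63, 2**63 - 1

# Read every column as text so nothing is parsed before we decide how to convert it.
df = pd.read_csv(io.StringIO(CSV_TEXT), dtype=str)

for name in df.columns:
    # Leave columns alone unless every value consists only of digits.
    if not df[name].str.isdigit().all():
        continue
    values = [int(v) for v in df[name]]
    # Use int64 when every value fits; otherwise keep Python ints in an object column.
    fits = all(INT64_MIN <= v <= INT64_MAX for v in values)
    df[name] = pd.Series(values, index=df.index, dtype="int64" if fits else object)

print(df.dtypes)
print(df["emp_id"] // 10**10)
```

Columns such as age and salary end up as regular int64, while emp_id stays an object column of Python ints, so only that column pays the speed cost.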
