
Python: How to store large numbers in a Pandas dataframe as int64 or float64?

Similar to the question "Storing big numbers over 9000 digits in Python", I'm interested in storing very large integers in a Pandas DataFrame where the dtype is int64 or float64, not object. Whenever I either initialize the DataFrame or cast the column to int64 or float64, I get this error: OverflowError: int too large to convert to float

Here is some sample code that raises the error:

data = [[4041067959774462618542251414053149763363284932506803841495981726909361589243016772093539952215008166854586458807896667612935940650616044271694578570770218354465319095565165551049760172710391683826002499005236096882016133967285292291606248423125012884140175919816849209382612886503119619750800600507246127268611380063066868139796774976684606993289391743637218529185641004454047725507720821393787669169611972814982330545723200072965546061194948505665350431588541107227045045135059495789131566496560507159916524037246652355679704655191235607257759392890459293292994869676442294348205840960197717998950931099935125824565443461965027936602550759188464075684122645652374411071687652948467619565381434911645676757024253483187841007912001722045733971195432548620690744725086837979031567344095323422174671522835282126126173748501439121944882602887928671532521816234961981946544118773557395130950306137831226533275921950157923776845085190156444450216692581322726107832236483226314003339464513548213142271415371910246088829012370639200542888385733241823213915919885883384151357374501359157931301139416090907994970949429195483607826525457136853740508614341446335314912887887891647364907817033609726890368372485038664354107037004105702300397408085198993506316238517085901918870189631204393632008524269869979074462426748217010716364884706958521730228474069227641283826703864839419845872269299777537, 10], [2, 15], [3, 14]]

  
import pandas as pd

# Building the DataFrame directly from the nested list raises the error below
df = pd.DataFrame(data, columns=['code1', 'code2'])

---------------------------------------------------------------------------
OverflowError                             Traceback (most recent call last)
...
OverflowError: int too large to convert to float
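For context on the error, here is a minimal sketch of the limits involved. The value `big` is a hypothetical stand-in (10**400) for the much longer integer in the question, and the variable names are mine, not from the original post. It only checks the ranges of int64 and float64 and reproduces the same message with a plain `float()` call.

```python
import sys
import numpy as np

# Hypothetical stand-in for the huge integer in the question
big = 10 ** 400

int64_max = np.iinfo(np.int64).max   # largest value an int64 can hold
float64_max = sys.float_info.max     # largest finite float64 (Python float)

print(int64_max)     # 9223372036854775807
print(float64_max)   # roughly 1.8e308

# Python compares arbitrary-size ints against both limits exactly
fits_int64 = big <= int64_max
fits_float64 = big <= float64_max
print(fits_int64, fits_float64)

# Converting the integer to a float fails with the message from the question
error_message = None
try:
    float(big)
except OverflowError as exc:
    error_message = str(exc)
print(error_message)
```

If both checks come back False, as they do here, neither fixed-width dtype has room for the value, no matter how the DataFrame is built.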

I know I can convert the list to an np.array first, but that forces the dtype to be object rather than int64 or float64.
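The behaviour described here can be checked with a short sketch. Again `big` is a hypothetical stand-in for the integer in the question, and `cast_failed` is just an illustrative flag.

```python
import numpy as np

# Hypothetical stand-in for the huge integer in the question
big = 10 ** 400

# NumPy cannot fit this value into a fixed-width integer,
# so it stores the Python int objects themselves
arr = np.array([big, 2, 3])
print(arr.dtype)        # object
print(arr[0] == big)    # the value is preserved exactly

# Casting afterwards hits the same limit as before
cast_failed = False
try:
    arr.astype(np.float64)
except OverflowError as exc:
    cast_failed = True
    print("cast failed:", exc)
```

So the object dtype is not a quirk of the conversion: it is how NumPy keeps an integer of this size exact.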

Convert the data to an array with numpy.asarray() before building the DataFrame. The code below does this for the data in the question:

import numpy as np 
import pandas as pd 

data = [[4041067959774462618542251414053149763363284932506803841495981726909361589243016772093539952215008166854586458807896667612935940650616044271694578570770218354465319095565165551049760172710391683826002499005236096882016133967285292291606248423125012884140175919816849209382612886503119619750800600507246127268611380063066868139796774976684606993289391743637218529185641004454047725507720821393787669169611972814982330545723200072965546061194948505665350431588541107227045045135059495789131566496560507159916524037246652355679704655191235607257759392890459293292994869676442294348205840960197717998950931099935125824565443461965027936602550759188464075684122645652374411071687652948467619565381434911645676757024253483187841007912001722045733971195432548620690744725086837979031567344095323422174671522835282126126173748501439121944882602887928671532521816234961981946544118773557395130950306137831226533275921950157923776845085190156444450216692581322726107832236483226314003339464513548213142271415371910246088829012370639200542888385733241823213915919885883384151357374501359157931301139416090907994970949429195483607826525457136853740508614341446335314912887887891647364907817033609726890368372485038664354107037004105702300397408085198993506316238517085901918870189631204393632008524269869979074462426748217010716364884706958521730228474069227641283826703864839419845872269299777537, 10], [2, 15], [3, 14]]

data_array = np.asarray(data) 

df = pd.DataFrame(
    data_array, 
    columns=['code1', 'code2']
    )

print(df['code1'][0])


 