Numbers change after reading Pickle data (python) into R
I have a large dataset with Unix epoch dates embedded in lists/dicts, currently stored as a pickle file. I tried to import the pickle file into R using the reticulate package via the py_load_object() function. Everything seems fine except the Unix epoch dates (in milliseconds), which come through as very strange integers. For example, an epoch date of 694137600000 is read as -1647101952 in R. Is there an explanation and a workaround?
Thanks!
It is very hard to help you without a minimal reproducible example, but here are some ideas:
Put the data into a pandas data frame inside your Python script. The source_python function from reticulate will import it as an R data frame. Please refer to the documentation for additional information on type conversions: rstudio/reticulate
Save the data as csv using Python and then import it into R. This way you can bypass reticulate, which is not always an efficient option.
Please also note that you may need some help handling 13-digit numbers in R. The bit64 package would be of interest to you.
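The csv route can be sketched roughly as follows. This is a minimal illustration, not the poster's actual script: the function name pickle_to_csv and both file paths are hypothetical, and it assumes the pickle holds a list of flat dicts that all share the same keys. It uses only the standard library; with pandas installed, building a DataFrame from the records and calling its to_csv method would be an alternative.

```python
import csv
import pickle


def pickle_to_csv(pickle_path, csv_path):
    """Unpickle a list of flat dicts and write them to a CSV file.

    Returns the number of records written.
    """
    with open(pickle_path, "rb") as fh:
        records = pickle.load(fh)  # only unpickle files you trust

    # Assumption: every record is a flat dict with the same keys.
    fieldnames = list(records[0].keys())

    with open(csv_path, "w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(records)  # ints are written as plain decimal text

    return len(records)
```

Because the epoch values are written as plain text, nothing is truncated on the Python side. On the R side, reading the file with read.csv should normally yield a numeric (double) column for values this large, and doubles represent integers exactly up to 2^53.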
The problem is that reticulate is treating the values as 32-bit integers. The value 694137600000 needs 40 bits, so only its lowest 32 bits survive. You can see the problem with the Python snippet below:
In [1]: v = 694137600000
In [2]: v.bit_length()
Out[2]: 40
In [3]: import ctypes
In [4]: ctypes.c_int(v)
Out[4]: c_long(-1647101952)
In [5]: _.value
Out[5]: -1647101952
In [6]: ctypes.c_int64(v)
Out[6]: c_longlong(694137600000)
In [7]: ctypes.c_int32(v)
Out[7]: c_long(-1647101952)
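The same wraparound can be reproduced without ctypes, which makes the arithmetic explicit. The helper name wrap_int32 is hypothetical and only illustrates two's-complement truncation; it is not a claim about how reticulate performs the conversion internally.

```python
def wrap_int32(value):
    """Return value as it appears after truncation to a signed 32-bit integer."""
    low_bits = value & 0xFFFFFFFF  # keep only the lowest 32 bits
    # In two's complement, a set top bit (bit 31) means the number is negative.
    return low_bits - 2**32 if low_bits >= 2**31 else low_bits


epoch_ms = 694137600000
print(wrap_int32(epoch_ms))  # -1647101952
```

The truncation discards the upper bits, so the original value cannot be recovered from -1647101952 alone; the fix has to happen before or during the transfer to R.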
One of the easiest workarounds is to unpickle the file in Python and save it as a .csv file. Alternatively, if you convert the pickled data to a pandas data frame and then access it from R, it should be converted to an R data frame, unless the date/time is the first column (see here for why).