Numbers change after reading pickle data (Python) into R

Question

I have a large dataset with Unix epoch dates embedded in lists/dicts, currently stored as a pickle file. I tried to import the pickle file into R using the reticulate package via the py_load_object() function. Apart from the Unix epoch dates (in milliseconds), everything else seems fine.

I get very strange integer conversions. For example, an epoch date of 694137600000 is read as -1647101952 in R. I was wondering if there is an explanation and a workaround.

Thanks!

Answer 1

It is very hard to help you without a minimal reproducible example, but here are some ideas:

You can un-pickle the file and convert it to a pandas data frame inside your Python script. The source_python function from reticulate will then import it as an R data frame. Please refer to the documentation for additional information on type conversions: rstudio/reticulate
It is always possible to un-pickle the file in Python, export it to a common format such as csv, and then import that into R. This way you can bypass reticulate, which is not always an efficient option.
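The CSV route from the second idea could look roughly like the sketch below. It assumes the pickle holds a list of flat dicts that all share the same keys; the function name pickle_to_csv and both paths are hypothetical, and nested data would need flattening first.

```python
import csv
import pickle


def pickle_to_csv(pickle_path, csv_path):
    """Un-pickle a list of flat dicts and write it out as a CSV file.

    Assumption: every record is a dict with the same keys.
    """
    with open(pickle_path, "rb") as fh:
        records = pickle.load(fh)  # only un-pickle files you trust

    # Use the keys of the first record as the CSV header.
    fieldnames = list(records[0].keys())

    with open(csv_path, "w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(records)
```

Python integers are written as plain decimal text, so a value such as 694137600000 reaches the CSV file unchanged; how it is typed on the R side then depends on the CSV reader you use there.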

Please also note that you may need some help when it comes to handling 13-digit numbers in R. The bit64 package would be of interest to you.
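Another option, not mentioned in the answers and offered only as a sketch, is to convert the large integers to floats on the Python side before the data reaches R. A double represents every integer up to 2**53 exactly, so millisecond epochs survive intact. The helper name widen_large_ints is hypothetical, and the sketch assumes the data consists of plain lists, tuples and dicts.

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def widen_large_ints(obj):
    """Return a copy of obj with ints outside the 32-bit range turned into floats."""
    if isinstance(obj, bool):
        # bool is a subclass of int; leave it untouched.
        return obj
    if isinstance(obj, int):
        return obj if INT32_MIN <= obj <= INT32_MAX else float(obj)
    if isinstance(obj, dict):
        return {key: widen_large_ints(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        # Rebuild the same container type with converted elements.
        return type(obj)(widen_large_ints(item) for item in obj)
    return obj
```

For example, widen_large_ints({"dates": [694137600000, 5]}) leaves 5 as an integer and turns the epoch value into the float 694137600000.0.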

Answer 2

The problem is that reticulate is treating the values as 32-bit integers. The Python snippet below shows the effect:

In [1]: v = 694137600000

In [2]: v.bit_length()
Out[2]: 40

In [3]: import ctypes

In [4]: ctypes.c_int(v)
Out[4]: c_long(-1647101952)

In [5]: _.value
Out[5]: -1647101952

In [6]: ctypes.c_int64(v)
Out[6]: c_longlong(694137600000)

In [7]: ctypes.c_int32(v)
Out[7]: c_long(-1647101952)
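The same wrap-around can be reproduced with plain integer arithmetic, which makes the truncation explicit. This is a minimal sketch; the function name wrap_int32 is made up for illustration.

```python
def wrap_int32(value):
    """Reduce an integer to the signed 32-bit range, as a C int32 cast would."""
    return (value + 2**31) % 2**32 - 2**31


print(wrap_int32(694137600000))  # -1647101952
```

The value needs 40 bits, so only its low 32 bits are kept; those bits, read as a signed number, give -1647101952.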

One of the easiest workarounds is to un-pickle your file in Python and save it as a .csv file. However, you should find that if you convert the pickled data to a pandas data frame and then access it from R, it will be converted to an R data frame, unless the date/time is the first column (see here for why).

Answer 1
1 2018-08-04 06:25:02

Answer 2
0 2018-08-04 06:43:32
