
rpy2 (version 2.3.10) - importing data from R package into Python

I am trying to import some data from an R package into Python in order to test some other python-rpy2 functions that I have written. In particular, I am using the SpatialEpi package in R and its pennLC dataset.

I was able to import rpy2 and load the package correctly. However, I am not sure how to access the data in the package.

import rpy2.robjects as robjects
from rpy2.robjects.packages import importr
spep = importr("SpatialEpi")

Specifically, I can't seem to access the data object pennLC in the SpatialEpi package to test the function. The equivalent R command would be:

data(pennLC)

Any suggestions?

In R, calling data("foo") can create an arbitrary number of objects in the workspace. In rpy2, the fetched objects are instead contained in an environment, which keeps things cleaner.

from rpy2.robjects.packages import importr, data
spep = importr("SpatialEpi")
pennLC_data = data(spep).fetch('pennLC')

pennLC_data is an Environment (think of it as a namespace).

To list what was fetched:

pennLC_data.keys()

To get the data object wanted:

pennLC_data['pennLC'] # guessing here, it might be a different name
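
As a small sketch of inspecting what fetch returned: the helper below (summarize_env is a hypothetical name, not part of rpy2) maps each name in the environment to the type name of the object stored under it. It only assumes the object supports keys() and item lookup, which is how the Environment is used above, so a plain dict serves as a stand-in here.

```python
def summarize_env(env):
    """Map each name in a mapping-like object to the type name of its value.

    Assumption: `env` supports keys() and env[name], the same way the
    rpy2 Environment returned by fetch() is used in the answer above.
    """
    return {name: type(env[name]).__name__ for name in env.keys()}


# Stand-in data so the sketch runs without R or rpy2 installed.
example = {"pennLC": [1, 2, 3], "label": "lung cancer"}
print(summarize_env(example))
```

With a real session, summarize_env(pennLC_data) should show which rpy2 class each fetched object was converted to, which helps confirm the name to use for the lookup.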

I figured out an answer based on guidance from Laurent's message above.

I am using rpy2 version 2.3.10, which introduces some differences from Laurent's code above. Here is what I did.

import rpy2.robjects as robj
from rpy2.robjects.packages import importr
spep = importr('SpatialEpi', data = True)
data = spep.__rdata__.fetch('pennLC')

First, note that there is no .data method in rpy2 2.3.10 (the name might have changed). Instead, the 2.3.10 documentation indicates that passing data=True to importr places a PackageData object under Package.__rdata__, so I can call fetch on that __rdata__ object.
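
The two approaches in this thread can be folded into one helper. This is only a sketch: fetch_dataset and its packages parameter are hypothetical names introduced here, and it assumes that the data function and the data=True / __rdata__ route behave as described in the two answers. The packages argument exists so the logic can be exercised with a stand-in object when R is not available.

```python
def fetch_dataset(pkg_name, dataset_name, packages=None):
    """Fetch a dataset environment from an R package across rpy2 versions.

    Hypothetical helper. `packages` defaults to rpy2.robjects.packages and
    can be replaced by any object exposing the same attributes.
    """
    if packages is None:
        # Imported lazily so the function can be defined without rpy2 present.
        import rpy2.robjects.packages as packages

    if hasattr(packages, "data"):
        # Approach from the first answer: data(package).fetch(name)
        pkg = packages.importr(pkg_name)
        return packages.data(pkg).fetch(dataset_name)

    # Approach for rpy2 2.3.10: request the data at import time and use
    # the PackageData object stored under __rdata__.
    pkg = packages.importr(pkg_name, data=True)
    return pkg.__rdata__.fetch(dataset_name)


# Usage with a working R + rpy2 installation (not executed here):
# env = fetch_dataset("SpatialEpi", "pennLC")
```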

Then, when I want to access the data, I can use the following code.

data['pennLC'][1]

In [43]: type(data['pennLC'][1])
Out[43]: rpy2.robjects.vectors.DataFrame

To view the data:

print(data['pennLC'][1])
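
Note that Python indexing is zero-based, so [1] selects the second component of the R list. Positional access can be fragile, so here is a hedged sketch of looking a component up by name instead. component_by_name is a hypothetical helper; it assumes the rpy2 vector exposes a names attribute that iterates as strings, and the component name "data" used in the usage comment is an assumption that should be checked against the actual names first.

```python
def component_by_name(rvector, name):
    """Return the element of a named vector-like object by its name.

    Assumption: `rvector.names` is iterable and yields strings, and
    `rvector` supports zero-based integer indexing.
    """
    names = list(rvector.names)
    if name not in names:
        raise KeyError("%r not found; available names: %r" % (name, names))
    return rvector[names.index(name)]


# Usage with the objects from this answer (not executed here):
# print(list(data['pennLC'].names))                 # check the real names
# df = component_by_name(data['pennLC'], 'data')    # 'data' is an assumed name
```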
