rpy2 (version 2.3.10) - importing data from R package into python
I am trying to import some data from an R package into Python in order to test some other Python/rpy2 functions that I have written. In particular, I am using the SpatialEpi package in R and the pennLC dataset.
I was able to import rpy2 and load the R package correctly. However, I am not sure how to access the data in the package.
import rpy2.robjects as robjects
from rpy2.robjects.packages import importr
spep = importr("SpatialEpi")
However, I can't seem to access the data object pennLC in the SpatialEpi package to test the function. The equivalent R command would be:
data(pennLC)
Any suggestions?
In R, doing data("foo") can create an arbitrary number of objects in the workspace. In rpy2, the objects are placed in an environment instead, which keeps things cleaner.
from rpy2.robjects.packages import importr, data
spep = importr("SpatialEpi")
pennLC_data = data(spep).fetch('pennLC')
pennLC_data is an Environment (think of it as a namespace).
To list what was fetched:
pennLC_data.keys()
To get the data object you want:
pennLC_data['pennLC'] # guessing here, it might be a different name
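Since the object name above is a guess, one way to avoid guessing is to copy everything that was fetched into a plain dict and inspect it. The following is only a sketch: fetch_dataset and load_from_r are hypothetical helper names, not part of rpy2, and the code relies only on the fetch(), keys() and item access already used in this answer.

```python
def fetch_dataset(package_data, name):
    """Fetch `name` and return {object_name: object} for everything it created.

    `package_data` only needs a .fetch(name) method whose result supports
    .keys() and item access, which is what the snippets above rely on.
    """
    env = package_data.fetch(name)
    return {key: env[key] for key in env.keys()}


def load_from_r(package_name, dataset_name):
    """Hypothetical wrapper; needs rpy2 and the R package to be installed."""
    # Imported here so that fetch_dataset can be used without rpy2 present.
    from rpy2.robjects.packages import importr, data
    return fetch_dataset(data(importr(package_name)), dataset_name)
```

With rpy2 and SpatialEpi installed, load_from_r("SpatialEpi", "pennLC") should return a dict whose keys are the R object names that were created, so the right one can be picked by looking rather than guessing.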
I figured out an answer based on the guidance in Laurent's message above.
I am using rpy2 version 2.3.10, which introduces some differences from Laurent's code. Here is what I did.
import rpy2.robjects as robj
from rpy2.robjects.packages import importr
spep = importr('SpatialEpi', data = True)
data = spep.__rdata__.fetch('pennLC')
First, note that there is no data function in rpy2.robjects.packages in rpy2 2.3.10; the name might have changed. Instead, the 2.3.10 documentation indicates that passing the data=True argument to importr places a PackageData object under .__rdata__ on the returned package. So I can call fetch on the __rdata__ object.
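The two approaches in this thread differ only in how the PackageData object is obtained, so the choice can be wrapped in one small function. This is a sketch under assumptions: get_package_data is a hypothetical name, and the check simply tests whether the packages module exposes a data function, falling back to the importr(..., data=True) route described for 2.3.10.

```python
def get_package_data(rpackages, package_name):
    """Return the PackageData object for an R package.

    `rpackages` is expected to be the rpy2.robjects.packages module
    (passed in as an argument so the function has no import-time
    dependency on rpy2).
    """
    if hasattr(rpackages, "data"):
        # Route from the first answer: packages.data(importr(...))
        return rpackages.data(rpackages.importr(package_name))
    # Route described for rpy2 2.3.10: importr(..., data=True).__rdata__
    return rpackages.importr(package_name, data=True).__rdata__
```

Usage would look like `import rpy2.robjects.packages as rpackages` followed by `get_package_data(rpackages, "SpatialEpi").fetch("pennLC")`.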
Then, when I want to access the data, I can use the following code.
data['pennLC'][1]
In [43]: type(data['pennLC'][1])
Out[43]: rpy2.robjects.vectors.DataFrame
To view the data:
print(data['pennLC'][1])
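The positional index [1] works, but it depends on the order of the elements inside the R list. A name-based lookup may be more robust. The sketch below assumes the list-like object exposes a names attribute (as rpy2 vectors do) and that the data frame element is called 'data', which is not confirmed anywhere in this thread; component_by_name is a hypothetical helper.

```python
def component_by_name(r_list, name):
    """Return the element called `name` from a named, list-like R object.

    Only relies on a .names attribute and integer indexing.
    """
    names = list(r_list.names)
    if name not in names:
        raise KeyError("{!r} not found; available: {}".format(name, names))
    return r_list[names.index(name)]
```

For example, component_by_name(data['pennLC'], 'data') would replace data['pennLC'][1] if the element really is named 'data'. rpy2 vectors also offer rx2('data') for the same kind of lookup.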