Trouble calling R function from Python with rpy2

I am trying to use rpy2 to call the R package MatchIt. I am having difficulty seeing the outcome of the matched pairs from $match.matrix. Here is the R code I am trying to execute in Python.

matched <- cbind(lalonde[row.names(foo$match.matrix),"re78"],lalonde[foo$match.matrix,"re78"])

Here is my Python code:

import readline
import rpy2.robjects
from rpy2.robjects.packages import importr
from rpy2.robjects import pandas2ri
from rpy2 import robjects as ro

import numpy as np
from scipy.stats import ttest_ind
import pandas as pd
from pandas import Series,DataFrame

pandas2ri.activate()
R = ro.r
MatchIt = importr('MatchIt')
base = importr('base')

df = R('lalonde')
lalonde = pandas2ri.py2ri(df)
formula = 'treat ~ age + educ + black + hispan + married + nodegree + re74 + re75'

foo = MatchIt.matchit(formula = R(formula),
                               data = lalonde,
                               method = R('"nearest"'),
                               ratio = 1)

matched = \
base.cbind(lalonde.rx[base.row_names(foo.rx2('match.matrix')),"re78"], 
       lalonde.rx[foo.rx2('match.matrix'),"re78"])

This chunk runs:

lalonde.rx(base.row_names(foo.rx2('match.matrix')),
       "re78")

but this chunk

lalonde.rx[foo.rx2('match.matrix'),"re78"]

returns this error:

ValueError: The first parameter must be a tuple.

The output of

cbind(lalonde[row.names(foo$match.matrix),"re78"], lalonde[foo$match.matrix,"re78"])

should be a data frame that pairs the row names and cell values of foo$match.matrix with the corresponding "re78" values in the lalonde data frame.

Here lalonde is defined elsewhere (but thanks to @Parfait's question we know that it is a data frame). Now you'll have to break down the one-liner that triggers the error to pinpoint exactly where the trouble is (we can't do that for you; the point of self-contained, reproducible examples is that they help us help you).

matched = \
base.cbind(lalonde[base.row_names(foo.rx2('match.matrix')),"re78"], 
           lalonde[foo.rx2('match.matrix'),"re78"])

Is this breaking with the first subset of lalonde?

lalonde[base.row_names(foo.rx2('match.matrix')),"re78"]

Since type(lalonde) is rpy2.robjects.vectors.DataFrame, this is an R/rpy2 data frame. Extracting a subset the way one would in R can be done with .rx (as in R-style extraction; see http://rpy2.readthedocs.io/en/version_2.8.x/vector.html#extracting-r-style).

lalonde.rx(base.row_names(foo.rx2('match.matrix')),
           "re78")

It is important to understand what is happening with this call. By default, the elements to extract along each dimension of the data structure (here the rows and columns of the data frame, respectively) must be R vectors (a vector of names, or a vector of one-offset integer indices) or a Python data structure that the conversion mechanism can translate into an R vector (of names or integers). base.row_names will return the row names (a vector of names), but foo.rx2('match.matrix') might be something else.
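As a rough illustration of the two kinds of selector mentioned above, here is a pure-Python sketch (no rpy2 involved) that resolves either row names or one-offset integers to Python's zero-offset positions. The helper resolve_rows, the row names, and the values are all hypothetical, made up for this example only.

```python
def resolve_rows(selector, row_names):
    """Turn an R-style row selector into zero-offset Python positions.

    `selector` holds either row names (str) or one-offset integers,
    the two forms of positive selection described in the text.
    """
    positions = []
    for item in selector:
        if isinstance(item, str):
            positions.append(row_names.index(item))  # look up by name
        else:
            positions.append(item - 1)  # R counts from 1, Python from 0
    return positions


# Toy stand-ins: made-up row names and values, not the lalonde data.
row_names = ["t1", "t2", "c1", "c2"]
re78 = [10.0, 20.0, 30.0, 40.0]

by_name = resolve_rows(["t2", "c2"], row_names)
by_index = resolve_rows([2, 4], row_names)

print(by_name, by_index)
print([re78[p] for p in by_name])
```

Both selectors pick the same two rows, which is why either a vector of names or a vector of one-offset integers is acceptable as a parameter to .rx.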

Here type(foo.rx2('match.matrix')) indicates that this is a matrix. Matrices can be used to cherry-pick cells in an R array, but in that case there can only be one parameter for the extraction... and we presently have two (the second is "re78").

Since the first column of that match.matrix contains the indices (row numbers) in lalonde, the following should be what you want:

matched = \
base.cbind(lalonde.rx(base.row_names(foo.rx2('match.matrix')), "re78"),
           lalonde.rx(foo.rx2('match.matrix').rx(True, 1), "re78"))
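To picture what this call is expected to return, here is a minimal pure-Python sketch of the same pairing logic, without rpy2 or R. Everything in it is an assumption for illustration: the row names and re78 values are made up, the match matrix is modelled as a list of one-element rows, and first_column is a hypothetical helper standing in for the role of .rx(True, 1).

```python
def first_column(matrix):
    """Flatten a one-column matrix (a list of rows) into a plain list."""
    return [row[0] for row in matrix]


# Toy stand-ins: made-up row names and re78 values, not the lalonde data.
re78 = {"t1": 10.0, "t2": 20.0, "c1": 30.0, "c2": 40.0}
match_row_names = ["t1", "t2"]   # row names of the match matrix (treated units)
match_matrix = [["c2"], ["c1"]]  # cells: the matched control unit for each row

# Look up re78 once by the matrix's row names, once by its first column.
treated = [re78[name] for name in match_row_names]
control = [re78[name] for name in first_column(match_matrix)]

# Rough equivalent of cbind(): one row per matched pair.
matched = [list(pair) for pair in zip(treated, control)]
print(matched)
```

Each row of matched holds the re78 value of a treated unit next to that of its matched control, which is the shape the question asks for.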
