
Using R packages in Python using rpy2

There is a package in R that I need to use on my data. All my data preprocessing has already been done in Python, and all the modelling as well. The R package is 'PMA'. I have used rpy2 before with R's PLS package, as follows:

import numpy as np
from rpy2.robjects.numpy2ri import numpy2ri
import rpy2.robjects as ro

def Rpcr(X_train,Y_train,X_test):
    ro.r('''source('R_pls.R')''')
    r_pls=ro.globalenv['R_pls']
    r_x_train=numpy2ri(X_train)
    r_y_train=numpy2ri(Y_train)
    r_x_test=numpy2ri(X_test)

    p_res=r_pls(r_x_train,r_y_train,r_x_test)
    yp_test=np.array(p_res[0])
    yp_test=yp_test.reshape((yp_test.size,))
    yp_train=np.array(p_res[1])
    yp_train=yp_train.reshape((yp_train.size,))
    ncomps=np.array(p_res[2])
    ncomps=ncomps.reshape((ncomps.size,))

    return yp_test,yp_train,ncomps

When I followed this format this time, it gave an error saying that the function numpy2ri does not exist.
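The missing-function error is probably because newer rpy2 releases no longer expose a function called numpy2ri inside the numpy2ri module. A minimal sketch of an explicit conversion, assuming rpy2 3.x, where the module provides a `converter` object; the helper name `to_r` is made up for illustration:

```python
def to_r(array):
    """Convert a NumPy array to an R object (sketch, assumes rpy2 3.x)."""
    # Imported lazily so this file can be loaded where rpy2/R is not installed.
    import rpy2.robjects as ro
    from rpy2.robjects import numpy2ri

    # Combine the default rules with the NumPy rules and convert explicitly,
    # instead of calling the old numpy2ri() function.
    converter = ro.default_converter + numpy2ri.converter
    return converter.py2rpy(array)
```

With this, `r_x_train = to_r(X_train)` would take the place of `numpy2ri(X_train)` in the function above. It also avoids switching conversion on globally with `numpy2ri.activate()`.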

So I have been working from the rpy2 manual and have tried a number of things with no success. The package I am working with is used in R like so:

library('PMA')
cspa=CCA(X,Z,typex="standard", typez="standard", K=1, penaltyx=0.25, penaltyz=0.25)
# X and Z are dataframes with dimensions pXm and pXq
# CCA returns an R object from which I need two components, u and v
U<-cspa$u
V<-cspa$v
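On the Python side, the equivalent of `cspa$u` and `cspa$v` can be sketched with the `rx2` method that rpy2 list objects provide for extracting an element by name. The helper name `extract_components` is hypothetical, and it assumes the call returned an rpy2 list-like object rather than something already converted to NumPy:

```python
def extract_components(r_list, names=("u", "v")):
    """Return {name: component} for the requested names of an R list.

    r_list is expected to be an rpy2 list-like object (anything with an
    rx2 method); rx2(name) plays the role of R's r_list$name.
    """
    return {name: r_list.rx2(name) for name in names}
```

Each returned component is still an R vector or matrix; wrapping it in `np.asarray(...)` should give a NumPy array.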

So, following what I saw in the rpy2 documentation, I tried to load the package in Python and use it like so:

import rpy2.robjects as ro
from rpy2.robjects.packages import SignatureTranslatedAnonymousPackage as STAP
from rpy2.robjects import numpy2ri
from rpy2.robjects.packages import importr

base=importr('base')
scca=importr('PMA')
numpy2ri.activate() # Convert the NumPy arrays X1 and X2 to R objects automatically
out=scca.CCA(X1,X2,typex="standard",typez="standard", K=1, penaltyx=0.25,penaltyz=0.25)

and got the following error:

OMP: Error #15: Initializing libomp.dylib, but found libiomp5.dylib already initialized.
OMP: Hint This means that multiple copies of the OpenMP runtime have been linked into the  program. That is dangerous, since it can degrade performance or cause incorrect results. The best   thing to do is to ensure that only a single OpenMP runtime is linked into the process, e.g. by   avoiding static linking of the OpenMP runtime in any library. As an unsafe, unsupported, undocumented workaround you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to allow the program to continue to execute, but that may cause crashes or silently produce incorrect results. For more information, please see http://openmp.llvm.org/

Abort trap: 6

I also tried embedding the R code directly, following an example from the rpy2 documentation:

string = '''SCCA<-function(X,Z,K,alpha){
library("PMA")
scca<-CCA(X,Z,typex="standard",typez="standard",K=K,penaltyx=alpha,penaltyz=alpha)
u<-scca$u
v<-scca$v
out<-list(U=u,V=v)
return(out)}'''

scca=STAP(string,"scca")

which, as I understand it, can be used like an R function directly:

numpy2ri.activate()
scca(X,Z,1,0.25)

This results in the same error as above.

So I do not know exactly how to fix it and have been unable to find anything similar.

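The error is a macOS issue: as the message says, two copies of the OpenMP runtime (libomp.dylib and libiomp5.dylib) end up loaded in the same process. See https://stackoverflow.com/a/53014308/1628393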

So all you have to do is set the KMP_DUPLICATE_LIB_OK environment variable at the top of the script, and it works:

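import os
# Allow the duplicate OpenMP runtime; set this before the R code is loaded
os.environ['KMP_DUPLICATE_LIB_OK']='True'

# R source defining SCCA(), a thin wrapper around PMA's CCA()
string = '''SCCA<-function(X,Z,K,alpha){
library("PMA")
scca<-CCA(X,Z,typex="standard",typez="standard",K=K,penaltyx=alpha,penaltyz=alpha)
u<-scca$u
v<-scca$v
out<-list(U=u,V=v)
return(out)}'''

scca=STAP(string,"scca")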

Then the function is called with:

scca.SCCA(X,Z,1,0.25)
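For completeness, here is a rough end-to-end sketch that wraps the workaround, the R wrapper and the conversion back to NumPy in one function. It assumes rpy2 3.x and that PMA is installed in R; `run_scca` and `SCCA_SOURCE` are names made up for this example, and I have not checked how every rpy2 version converts the result:

```python
import os

# Workaround from the answer; set before NumPy/rpy2 load their OpenMP runtimes.
os.environ.setdefault("KMP_DUPLICATE_LIB_OK", "TRUE")

# R source for a thin wrapper around PMA::CCA that returns only u and v.
SCCA_SOURCE = '''
SCCA <- function(X, Z, K, alpha) {
  library("PMA")
  scca <- CCA(X, Z, typex="standard", typez="standard", K=K,
              penaltyx=alpha, penaltyz=alpha)
  list(U=scca$u, V=scca$v)
}
'''


def run_scca(X, Z, K=1, alpha=0.25):
    """Run the SCCA wrapper on two NumPy arrays and return (U, V)."""
    # Lazy imports: nothing below runs unless the function is called.
    import numpy as np
    import rpy2.robjects as ro
    from rpy2.robjects import numpy2ri
    from rpy2.robjects.packages import SignatureTranslatedAnonymousPackage as STAP

    scca = STAP(SCCA_SOURCE, "scca")
    # Convert the inputs explicitly (assumes the rpy2 3.x converter API).
    conv = ro.default_converter + numpy2ri.converter
    res = scca.SCCA(conv.py2rpy(X), conv.py2rpy(Z), K, alpha)
    # res is an R list; rx2 extracts an element by name.
    return np.asarray(res.rx2("U")), np.asarray(res.rx2("V"))
```

Usage would be `U, V = run_scca(X, Z, K=1, alpha=0.25)` with X and Z as NumPy arrays that have the same number of rows.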
