Creating R's formula using Python

I am writing a Python program that interacts with R. I have some R libraries that I want to use from my Python code. After installing rpy2, I defined the R functions I want to use in a separate .R script.

The R function needs a formula passed to it in order to apply a resampling technique (random undersampling in this case). Below is the R function that I wrote:

WFRandUnder <- function(target_variable, other, train, rel, thr.rel, C.perc, repl){
    # Build "target ~ col1+col2+..." as a string, then convert it to a formula
    a <- target_variable
    b <- '~'
    form_begin <- paste(a, b, sep=' ')
    fmla <- as.formula(paste(form_begin, paste(other, collapse= "+")))
    undersampled <- RandUnderRegress(fmla, train, rel, thr.rel, C.perc, repl)
    return(undersampled)
}

From Python, I pass the target variable's name along with a list containing the names of all the other columns, because I want the formula to look like this: my_target_variable ~ all other columns

However, in these lines:

    a <- target_variable
    b <- '~'
    form_begin <- paste(a, b, sep=' ')
    fmla <- as.formula(paste(form_begin, paste(other, collapse= "+")))

The formula is not always built correctly when my data has many columns. What should I do to make it work reliably? I am joining all the column names with the + operator.
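The post does not say why the formula fails, so the following is only one possible cause: with many columns it becomes more likely that some name is not a syntactically valid R name (it contains a space, starts with a digit, and so on), and as.formula cannot parse such a name unless it is wrapped in backticks. A minimal sketch, assuming that is the problem, of building the formula string on the Python side. The helpers quote_name and build_formula are hypothetical names, not part of rpy2 or of the original code:

```python
import re

# Rough test for names R accepts unquoted: letters, digits, '.' and '_',
# starting with a letter, or with a dot that is not followed by a digit.
# Assumption: reserved words such as "if" or "TRUE" are not handled here.
_SYNTACTIC = re.compile(r"(?:[A-Za-z]|\.(?![0-9]))[A-Za-z0-9._]*")


def quote_name(name):
    """Return the name unchanged if it looks syntactic, else backtick-quote it."""
    if "`" in name:
        raise ValueError(f"column name contains a backtick: {name!r}")
    if _SYNTACTIC.fullmatch(name):
        return name
    return "`" + name + "`"


def build_formula(target, predictors):
    """Build 'target ~ p1 + p2 + ...' as a plain string."""
    if not predictors:
        raise ValueError("at least one predictor column is required")
    rhs = " + ".join(quote_name(p) for p in predictors)
    return f"{quote_name(target)} ~ {rhs}"


formula_text = build_formula("price", ["area", "n rooms", "2nd_floor"])
print(formula_text)
```

The resulting string could then be handed to R, for example through as.formula in the .R script or through rpy2's Formula class.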

Thanks to @nicola, I was able to solve this problem by doing the following:

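create_formula <- function(target_variable, other){
    # Previous approach, which listed every column explicitly:
    # y <- target_variable
    # tilda <- '~'
    # form_begin <- paste(y, tilda, sep=' ')
    # fmla <- as.formula(paste(form_begin, paste(other, collapse= "+")))
    # return(fmla)

    # New approach: '.' stands for all columns of the data other than the target,
    # so 'other' is no longer needed to build the formula.
    y <- target_variable
    fmla <- as.formula(paste(y, '~ .'))
    return(fmla)
}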

I call this function from my Python program using rpy2. This causes no problems because whenever the formula is used, the data is passed along with it, so the . can be resolved against that data's columns. Sample code to demonstrate what I mean:

        if self.smogn:
            smogned = runit.WFDIBS(
                # the formula (get_formula is a Python function that calls create_formula, defined above in R)
                fmla=get_formula(self.target_variable, self.other),
                # the data
                dat=df_combined,
                method=self.phi_params['method'][0],
                npts=self.phi_params['npts'][0],
                controlpts=self.phi_params['control.pts'],
                thrrel=self.thr_rel,
                Cperc=self.Cperc,
                k=self.k,
                repl=self.repl,
                dist=self.dist,
                p=self.p,
                pert=self.pert)
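For readers who want to see the target ~ . idea from the Python side, here is a minimal sketch. It is not the get_formula used above, whose body is not shown in this post. The names dot_formula, expand_dot and make_r_formula are hypothetical. Only make_r_formula needs rpy2 and a working R installation, and it relies on rpy2's Formula class, which accepts a formula written as a string:

```python
def dot_formula(target):
    """Return 'target ~ .', leaving the choice of predictors to the data."""
    return f"{target} ~ ."


def expand_dot(target, columns):
    """Illustrate what '.' stands for: every column of the data except the target."""
    return [c for c in columns if c != target]


def make_r_formula(target):
    """Turn the string into an R formula object (assumes rpy2 and R are installed)."""
    from rpy2.robjects import Formula  # imported lazily so the rest runs without R
    return Formula(dot_formula(target))


print(dot_formula("y"))
print(expand_dot("y", ["x1", "y", "x2"]))
```

Because the formula no longer lists the column names, it stays the same length no matter how many columns the data frame has.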
