
Loop over multiple variables with same name but different suffix

I have a fairly complex set of functions that I need to apply to four dummy variables that share the same core name but end in a different number. I would like to apply these functions in one go rather than repeating the code four times.

As an example, here is a made-up dataset for illustration:

n <- 1:100
var1 <- NA
var1[n < 20] <- 1
var1[n > 50] <- 0
var2 <- NA
var2[n < 30] <- 1
var2[n > 50] <- 0
var3 <- NA
var3[n < 10] <- 1
var3[n > 40] <- 0
var4 <- NA
var4[n < 20] <- 1
var4[n > 450] <- 0
df <- data.frame(var1, var2, var3, var4, n)

There are three main steps I need to loop over for these variables: subset the data frame, create a new variable for each of the original ones, and write the results into a new data frame. Please don't ask why I need to do these; they are part of a much larger script.

These are the steps I need to perform, shown here for var1, but I need them for all four variables:

df_sub <- subset(df, !is.na(df$var1))

sample1 <- nrow(df_sub[df_sub$var1 == 1,])

if (sample1 < 35) {
  a1 <- NA
} else {
  a1 <- mean(df_sub$n[df_sub$var1 == 1])
}

new_df <- data.frame(a1,a2,a3,a4)

I was thinking of looping over the suffix, but I cannot figure out how R deals with this. I found a solution for creating a variable in a loop with assign() ( https://stats.stackexchange.com/questions/10838/produce-a-list-of-variable-name-in-a-for-loop-then-assign-values-to-them ), but I still cannot figure out how to handle the subset step or, more generally, how to loop over a number in a variable name rather than over a column number, a list, etc.

Alternatively, is there a way to write a function that creates variables and exports them to the environment outside the function, so that I can apply it to var1 - var4 in df and still get four different versions of a (a1 - a4) in new_df?

You can loop over the suffix, fetching the variable to work on in each iteration with get() and storing the result with assign(). As an example:

for (i in 1:number_of_variables) {
    variable <- get(paste0("var", i))

    ... work on the variable ...
    # Returns
    assign(paste0("df_sub", i), ... your result ...)
}
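
To make the loop above concrete, here is a minimal sketch applied to the question's data. It makes a few assumptions that are not in the original posts: the helper names make_dummy() and compute_a() and the min_n argument are hypothetical, the columns are read from df with [[ ]] instead of get() (get("var1") would find the standalone vector var1, not the column inside df), and the results are collected in a named list rather than created one by one with assign().

# Rebuild the question's example data with a small (hypothetical) helper:
# 1 below `one_below`, 0 above `zero_above`, NA in between.
make_dummy <- function(n, one_below, zero_above) {
  ifelse(n < one_below, 1, ifelse(n > zero_above, 0, NA))
}

n <- 1:100
df <- data.frame(
  var1 = make_dummy(n, 20, 50),
  var2 = make_dummy(n, 30, 50),
  var3 = make_dummy(n, 10, 40),
  var4 = make_dummy(n, 20, 450),
  n = n
)

# Hypothetical helper wrapping the question's three steps for one column:
# drop NAs, count rows where the dummy is 1, and return the mean of n for
# those rows, or NA if there are fewer than `min_n` of them.
compute_a <- function(data, var_name, min_n = 35) {
  x <- data[[var_name]]
  keep <- !is.na(x) & x == 1
  if (sum(keep) < min_n) NA_real_ else mean(data$n[keep])
}

# Loop over the numeric suffix and build the column name with paste0().
suffixes <- 1:4
results <- list()
for (i in suffixes) {
  results[[paste0("a", i)]] <- compute_a(df, paste0("var", i))
}
new_df <- as.data.frame(results)

# Same loop written with lapply(), using a lower threshold for illustration.
new_df_small <- as.data.frame(
  lapply(setNames(suffixes, paste0("a", suffixes)),
         function(i) compute_a(df, paste0("var", i), min_n = 10))
)

With the example data every dummy has fewer than 35 rows equal to 1, so new_df contains only NA under the question's threshold; new_df_small uses min_n = 10 only to show non-missing values. If separate objects a1 - a4 are really needed in the global environment, assign(paste0("a", i), ...) could replace the list assignment inside the loop, though a list or data frame is usually easier to work with afterwards.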


 