I'm trying to wrap my head around how to use lapply to recode several variables while appending the trailing number of each variable name to the recoded string.

Building on this post, I know that I can recode several variables at a time:

d2 <- lapply(d1, FUN=function(X) recode(X, "'Somewhat interested' ='Somewhat'; 'Not interested' = 'No'"))

But what I need to do is slightly different. Suppose that my data frame has sequentially labeled variables, e.g. var_1, var_2, var_3, and looks like this:

    var_1              var_2               var_3                   var_4
1:                                                                                
2: Somewhat interested Somewhat interested Somewhat interested      Not interested
3: Somewhat interested Somewhat interested Somewhat interested      Not interested
4:      Not interested Somewhat interested Somewhat interested Somewhat interested

I want to recode the variable and append the sequential identifier of the column name:

           var_1              var_2              var_3             var_4
1:                                                                            
2:         Somewhat 1         Somewhat 2         Somewhat 3               No 4
3:         Somewhat 1         Somewhat 2         Somewhat 3               No 4
4:               No 1         Somewhat 2         Somewhat 3         Somewhat 4

Thoughts on how to combine recode and paste together?

You can iterate over the column names themselves with sapply() instead of passing the columns to lapply(). (I had to remake the data by hand, so this works with the version I have.)

So

d2 <- lapply(d1, FUN=function(X) recode(X, "'Somewhat interested' ='Somewhat'; 'Not interested' = 'No'"))

turns into

d2 <- sapply(colnames(d1), FUN=function(X) recode(d1[,X], "'Somewhat interested' ='Somewhat'; 'Not interested' = 'No'"))

where d1[,X] selects the column that the function is applied to. This relies on d1 being a plain data.frame.

Now, to append the column suffix, we can use paste0():

"'Somewhat interested' ='Somewhat'; 'Not interested' = 'No'" 

is replaced by

paste0("'Somewhat interested' ='Somewhat ",X ,"'; 'Not interested' = 'No ", X,"'")

However, this still doesn't do exactly what you want: X is the full column name, so the result carries the "var_" prefix as well as the numeric suffix.

This means we then need to remove the prefix, and we can use substr() for that:

substr(X, 5, nchar(X))
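As a quick check of the index arithmetic: "var_" is four characters long, so the suffix starts at position 5. The column names below are made up for illustration, and the sub() line is just one possible alternative that avoids hard-coding the position.

```r
# Hypothetical column names, including a two-digit suffix
X <- c("var_1", "var_12")

# Keep everything from position 5 to the end of each name
suffix_substr <- substr(X, 5, nchar(X))

# Alternative: strip the literal "var_" prefix with a regex
suffix_sub <- sub("^var_", "", X)

suffix_substr
suffix_sub
```

Both calls should return the same suffixes as long as every column name really starts with "var_".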

All together now:

d2 <- sapply(colnames(d1), FUN=function(X) recode(d1[,X],  paste0("'Somewhat interested' ='Somewhat ",substr(X, 5, nchar(X)) ,"'; 'Not interested' = 'No ", substr(X, 5, nchar(X)),"'")))
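If you want to try the idea without installing the package that provides recode(), here is a minimal base-R sketch of the same approach. The sample d1 is my own reconstruction from the tables in the question (without the empty first row), and the named lookup vector is an assumption standing in for the recode() string, not part of the original answer.

```r
# Hypothetical sample data rebuilt from the question (plain data.frame)
d1 <- data.frame(
  var_1 = c("Somewhat interested", "Somewhat interested", "Not interested"),
  var_2 = c("Somewhat interested", "Somewhat interested", "Somewhat interested"),
  var_3 = c("Somewhat interested", "Somewhat interested", "Somewhat interested"),
  var_4 = c("Not interested", "Not interested", "Somewhat interested"),
  stringsAsFactors = FALSE
)

# Named lookup vector standing in for recode(): old value -> new value
lookup <- c("Somewhat interested" = "Somewhat", "Not interested" = "No")

# Loop over column names so the numeric suffix is available inside the function
d2 <- sapply(colnames(d1), function(X) {
  suffix <- substr(X, 5, nchar(X))        # drop the "var_" prefix
  paste(unname(lookup[d1[[X]]]), suffix)  # recode, then append the suffix
})
d2
```

Like the one-liner above, this returns a character matrix whose columns are named after the original variables. Values that are missing from the lookup vector come back as NA, so extend it if your data has more categories.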

You can just use regex:

mtx1 <- sapply(seq_along(df), function(x){gsub('interested', x, df[,x])})
mtx1
#      [,1]         [,2]         [,3]         [,4]        
# [1,] "Somewhat 1" "Somewhat 2" "Somewhat 3" "Not 4"     
# [2,] "Somewhat 1" "Somewhat 2" "Somewhat 3" "Not 4"     
# [3,] "Not 1"      "Somewhat 2" "Somewhat 3" "Somewhat 4"

Admittedly it leaves "Not" instead of "No", but you can either use a more complicated regex, or just change it separately:

apply(mtx1, 2, function(x){gsub('Not', 'No', x)})
#      [,1]         [,2]         [,3]         [,4]        
# [1,] "Somewhat 1" "Somewhat 2" "Somewhat 3" "No 4"      
# [2,] "Somewhat 1" "Somewhat 2" "Somewhat 3" "No 4"      
# [3,] "No 1"       "Somewhat 2" "Somewhat 3" "Somewhat 4"

Wrap with as.data.frame (or your favorite version) if you need data.frames instead of matrices.
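A sketch of that last step, in case it helps. The df below is rebuilt by hand from the question, so treat it as an assumption. Note that iterating over seq_along(df) loses the column names, so they are restored explicitly.

```r
# Hypothetical data frame rebuilt from the question
df <- data.frame(
  var_1 = c("Somewhat interested", "Somewhat interested", "Not interested"),
  var_2 = c("Somewhat interested", "Somewhat interested", "Somewhat interested"),
  var_3 = c("Somewhat interested", "Somewhat interested", "Somewhat interested"),
  var_4 = c("Not interested", "Not interested", "Somewhat interested"),
  stringsAsFactors = FALSE
)

# Same two regex passes as above
mtx1 <- sapply(seq_along(df), function(x) gsub("interested", x, df[, x]))
mtx2 <- apply(mtx1, 2, function(x) gsub("Not", "No", x))

# Convert back to a data.frame and put the original column names back
df2 <- as.data.frame(mtx2, stringsAsFactors = FALSE)
names(df2) <- names(df)
df2
```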

Note that if your data is stored as factors, it will be more efficient to run the same regex on the levels instead of the actual data.
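For the factor case, a rough sketch of what that could look like on a single column. The factor f and the suffix "4" are made up for illustration.

```r
# Hypothetical factor column standing in for var_4
f <- factor(c("Not interested", "Not interested", "Somewhat interested"))

# Rewrite each distinct level once instead of touching every element
new_levels <- gsub("interested", "4", levels(f))
new_levels <- gsub("Not", "No", new_levels)
levels(f) <- new_levels

as.character(f)
```

The regex only runs once per distinct level here, which is where the saving comes from when a column has many rows but few categories.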
