Question

I'm trying to wrap my head around how to use lapply to recode several variables while pasting the trailing number of each variable name into the recoded string.

Building on this post, I know that I can recode several variables at a time:

d2 <- lapply(d1, FUN=function(X) recode(X, "'Somewhat interested' ='Somewhat'; 'Not interested' = 'No'"))

But what I need to do is slightly different. Suppose that my data frame has sequentially labeled variables, e.g. var_1, var_2, var_3, and looks like this:

    var_1              var_2               var_3                   var_4
1:                                                                                
2: Somewhat interested Somewhat interested Somewhat interested      Not interested
3: Somewhat interested Somewhat interested Somewhat interested      Not interested
4:      Not interested Somewhat interested Somewhat interested Somewhat interested

I want to recode each variable and append the sequential identifier from its column name:

           var_1              var_2              var_3             var_4
1:                                                                            
2:         Somewhat 1         Somewhat 2         Somewhat 3               No 4
3:         Somewhat 1         Somewhat 2         Somewhat 3               No 4
4:               No 1         Somewhat 2         Somewhat 3         Somewhat 4

Thoughts on how to combine recode and paste together?

Answer 1

You can iterate over the column names themselves with sapply(), instead of iterating over the columns with lapply(). (I had to remake the data by hand, so this works with the version I have.)

So

d2 <- lapply(d1, FUN=function(X) recode(X, "'Somewhat interested' ='Somewhat'; 'Not interested' = 'No'"))

turns into

d2 <- sapply(colnames(d1), FUN=function(X) recode(d1[,X], "'Somewhat interested' ='Somewhat'; 'Not interested' = 'No'"))

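where d1[,X] selects the column to apply the function to. (This indexing assumes d1 is a plain data.frame; d1[[X]] selects the same column and also works on a data.table.)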
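To see what iterating over the names buys you, here is a minimal sketch on a made-up two-column data.frame (toy and tagged are illustrative names, not from the question). Inside the function, X is the column name, so it can be used both to fetch the column and as a piece of text:

```r
# Toy data, not the data from the question
toy <- data.frame(var_1 = c("a", "b"), var_2 = c("c", "d"),
                  stringsAsFactors = FALSE)

# X is a column name: use it to fetch the column and to label the values
tagged <- sapply(colnames(toy), function(X) paste(toy[[X]], X))
tagged
```

The result is a character matrix whose columns are named after the original columns, and every value carries the full column name, prefix included.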

Now, to append the column suffix, we can use paste0(). The string

"'Somewhat interested' ='Somewhat'; 'Not interested' = 'No'"

is replaced by

paste0("'Somewhat interested' ='Somewhat ",X ,"'; 'Not interested' = 'No ", X,"'")
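It can help to look at the string that paste0() actually builds before handing it to recode(). A quick check for a single column name (X is set by hand here, and spec is just an illustrative name):

```r
# Build the recode specification for one column name
X <- "var_1"
spec <- paste0("'Somewhat interested' ='Somewhat ", X,
               "'; 'Not interested' = 'No ", X, "'")
spec
```

Printing spec shows that the whole column name, var_1, ends up inside the recoded labels.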

However, this still doesn't do exactly what you want: X is the full column name, so you get the prefix as well as the suffix (e.g. "Somewhat var_1").

This means we then need to remove the prefix, and we can use substr() for that:

substr(X, 5, nchar(X))
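As a small sketch of what this extracts, here it is on two example names, next to a pattern-based alternative using sub() (not part of the original answer) that does not depend on the prefix being exactly four characters long:

```r
# Example column names, including a two-digit suffix
X <- c("var_1", "var_12")

# Positional: keep everything from the 5th character on
by_position <- substr(X, 5, nchar(X))

# Pattern-based: drop everything up to and including the last underscore
by_pattern <- sub(".*_", "", X)

by_position
by_pattern
```

Both give the same suffixes here; the sub() version would also cope with names like variable_3, if your prefixes vary.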

All together now:

d2 <- sapply(colnames(d1), FUN=function(X) recode(d1[,X],  paste0("'Somewhat interested' ='Somewhat ",substr(X, 5, nchar(X)) ,"'; 'Not interested' = 'No ", substr(X, 5, nchar(X)),"'")))
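If you would rather end up with a data.frame than a matrix, or don't have recode() available, the same idea can be sketched in base R. This is a stand-in, not the answer's method: it swaps recode() for a named lookup vector, and d1 below is a hand-made reconstruction of the three non-empty rows of the question's data (lookup and recode_with_suffix are made-up names).

```r
# Hand-made reconstruction of the question's data as a plain data.frame
d1 <- data.frame(
  var_1 = c("Somewhat interested", "Somewhat interested", "Not interested"),
  var_2 = c("Somewhat interested", "Somewhat interested", "Somewhat interested"),
  var_3 = c("Somewhat interested", "Somewhat interested", "Somewhat interested"),
  var_4 = c("Not interested", "Not interested", "Somewhat interested"),
  stringsAsFactors = FALSE
)

# Named vector standing in for the recode() specification
lookup <- c("Somewhat interested" = "Somewhat", "Not interested" = "No")

recode_with_suffix <- function(name) {
  suffix <- sub(".*_", "", name)        # "var_1" -> "1"
  short  <- unname(lookup[d1[[name]]])  # recode by indexing the named vector
  paste(short, suffix)
}

# Assigning into d2[] keeps the data.frame shape and the column names
d2 <- d1
d2[] <- lapply(names(d1), recode_with_suffix)
d2
```

Values that are not in lookup (such as the empty first row in the question) would come out as NA, so add them to the lookup if you need them kept.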

Answer 2

You can just use regex:

mtx1 <- sapply(seq_along(df), function(x){gsub('interested', x, df[,x])})
mtx1
#      [,1]         [,2]         [,3]         [,4]        
# [1,] "Somewhat 1" "Somewhat 2" "Somewhat 3" "Not 4"     
# [2,] "Somewhat 1" "Somewhat 2" "Somewhat 3" "Not 4"     
# [3,] "Not 1"      "Somewhat 2" "Somewhat 3" "Somewhat 4"

Admittedly it leaves "Not" instead of "No", but you can either use more complicated regex, or just change it separately:

apply(mtx1, 2, function(x){gsub('Not', 'No', x)})
#      [,1]         [,2]         [,3]         [,4]        
# [1,] "Somewhat 1" "Somewhat 2" "Somewhat 3" "No 4"      
# [2,] "Somewhat 1" "Somewhat 2" "Somewhat 3" "No 4"      
# [3,] "No 1"       "Somewhat 2" "Somewhat 3" "Somewhat 4"
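For the "more complicated regex" route, one possible single-pass pattern is sketched below. It assumes the only values are exactly "Somewhat interested" and "Not interested"; the small df here is a made-up stand-in for the real data, and mtx2 is an illustrative name.

```r
# Made-up stand-in for the data (plain data.frame of character columns)
df <- data.frame(
  var_1 = c("Somewhat interested", "Not interested"),
  var_2 = c("Not interested", "Somewhat interested"),
  stringsAsFactors = FALSE
)

# Capture "Somewhat" or "No", swallow the optional "t" and " interested",
# then put the column index after the captured word
mtx2 <- sapply(seq_along(df), function(i) {
  sub("^(Somewhat|No)t? interested$", paste0("\\1 ", i), df[[i]])
})
mtx2
```

Anything that doesn't match the pattern is left untouched, which makes unexpected values easy to spot afterwards.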

Wrap with as.data.frame (or your favorite version) if you need data.frames instead of matrices.

Note that if your data is stored as factors, it will be more efficient to run the same regex on the levels instead of the actual data.
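A rough sketch of the factor version, assuming every column is a factor (fdf is a made-up example, not the original data). Only the handful of levels gets rewritten, however many rows there are:

```r
# Made-up example with factor columns
fdf <- data.frame(
  var_1 = factor(c("Somewhat interested", "Not interested")),
  var_2 = factor(c("Not interested", "Not interested"))
)

for (i in seq_along(fdf)) {
  lv <- levels(fdf[[i]])
  lv <- sub("interested", as.character(i), lv, fixed = TRUE)  # append the index
  lv <- sub("^Not", "No", lv)                                 # "Not 1" -> "No 1"
  levels(fdf[[i]]) <- lv                                      # relabel in place
}
fdf
```

The columns stay factors; wrap them in as.character() if you need plain strings.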

Question

2 answers

solution1
1 2016-03-19 23:53:24

solution2
0 2016-03-19 23:49:59
