
Recode many variables to create a new data frame in R

I need to create a dataframe whose variables are recoded values of another dataframe.

The data matrix has a column of people who are rated by a group of raters plus an expert rater. Here's what the data structure looks like (these are just made-up values):

person <- c(1:10)
rater.1 <- c(2,3,2,3,4,3,4,2,3,3)
rater.2 <- c(4,3,2,3,1,2,3,2,3,1)
rater.3 <- c(3,2,3,1,2,2,2,3,1,2)
rater.4 <- c(3,4,3,4,3,4,2,2,3,2)
expert.rater <- c(4,4,2,3,1,2,1,2,2,2)

ratings <- data.frame(person,rater.1,rater.2, rater.3, rater.4, expert.rater)

In my real data set, however, I have 131 raters and 400 people.

I need to compare each rater to the expert and make a new dataframe of the difference scores. I can think of doing it this way, except it is very tedious and probably not a good idea:

rater.1_a <- abs(rater.1 - expert.rater)
rater.2_a <- abs(rater.2 - expert.rater)
rater.3_a <- abs(rater.3 - expert.rater)
rater.4_a <- abs(rater.4 - expert.rater)

difference <- data.frame(person,rater.1_a,rater.2_a, rater.3_a, rater.4_a)

Is there a quicker way to create the 131 new rater.x_a variables?

Why not just:

abs(ratings[,2:5] - ratings[,6])
   rater.1 rater.2 rater.3 rater.4
1        2       0       1       1
2        1       1       2       0
3        0       0       1       1
4        0       0       2       1
5        3       0       1       2
6        1       0       0       2
7        3       2       1       1
8        0       0       1       0
9        1       1       1       1
10       1       1       0       0

(And if your data is large, and all numeric, it might be faster to do this using a matrix rather than a data frame.)
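As a rough sketch of that matrix variant, assuming the rater columns are all numeric and share the `rater.` prefix used in the question (the names `rater_cols`, `m`, and `diff_m` are made up for illustration):

```r
# Rebuild the example data from the question.
ratings <- data.frame(
  person       = 1:10,
  rater.1      = c(2, 3, 2, 3, 4, 3, 4, 2, 3, 3),
  rater.2      = c(4, 3, 2, 3, 1, 2, 3, 2, 3, 1),
  rater.3      = c(3, 2, 3, 1, 2, 2, 2, 3, 1, 2),
  rater.4      = c(3, 4, 3, 4, 3, 4, 2, 2, 3, 2),
  expert.rater = c(4, 4, 2, 3, 1, 2, 1, 2, 2, 2)
)

# Pick the rater columns by name rather than by position,
# so the same code works for 4 raters or 131.
rater_cols <- grep("^rater\\.", names(ratings), value = TRUE)

# Convert just those columns to a numeric matrix.
m <- as.matrix(ratings[, rater_cols])

# The expert vector has one value per row, so it is recycled
# down each column: every rater column is compared to the expert.
diff_m <- abs(m - ratings$expert.rater)
```

This relies on the expert vector having exactly as many elements as the matrix has rows; if the lengths differed, the recycling would silently misalign the comparison.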

This will create a matrix of 'difference scores':

> ToCalc = ratings[,grep("^rater\\.", names(ratings))]
> Result = apply(ToCalc, 2, function(X) abs(X - ratings$expert.rater))

      rater.1 rater.2 rater.3 rater.4
 [1,]       2       0       1       1
 [2,]       1       1       2       0
 [3,]       0       0       1       1
 [4,]       0       0       2       1
 [5,]       3       0       1       2
 [6,]       1       0       0       2
 [7,]       3       2       1       1
 [8,]       0       0       1       0
 [9,]       1       1       1       1
[10,]       1       1       0       0

Then, to match the format of the original data frame:

Result = data.frame(person=ratings$person, Result, expert.rater=ratings$expert.rater)
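If the new columns should carry the `rater.x_a` names from the question, one possible way is to rename them after computing the differences. This is only a sketch: `rater_cols` is a hypothetical helper name, and it assumes every rater column starts with `rater.`.

```r
# Rebuild the example data from the question.
ratings <- data.frame(
  person       = 1:10,
  rater.1      = c(2, 3, 2, 3, 4, 3, 4, 2, 3, 3),
  rater.2      = c(4, 3, 2, 3, 1, 2, 3, 2, 3, 1),
  rater.3      = c(3, 2, 3, 1, 2, 2, 2, 3, 1, 2),
  rater.4      = c(3, 4, 3, 4, 3, 4, 2, 2, 3, 2),
  expert.rater = c(4, 4, 2, 3, 1, 2, 1, 2, 2, 2)
)

# Names of the rater columns (the expert column is excluded
# because its name does not start with "rater.").
rater_cols <- grep("^rater\\.", names(ratings), value = TRUE)

# Absolute difference between each rater and the expert,
# kept alongside the person identifier.
difference <- data.frame(
  person = ratings$person,
  abs(ratings[rater_cols] - ratings$expert.rater)
)

# Append the "_a" suffix used in the question to every rater column.
names(difference)[-1] <- paste0(rater_cols, "_a")
```

The result has the same layout as the hand-built `difference` data frame in the question, without typing out one line per rater.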
