Recoding a variable into multiple new values

Question

I have a dataset that contains observed scores for a group of people, like this:

person_id <- c(1:50)
person_score <- rep(1:10,5)
people <- data.frame(person_id, person_score)

I need to create a set of new variables that are recoded values of the observed scores. I have a set of variables that are the "keys" for transforming the observed scores to the new variables, like this:

observed <- c(1,2,3,4,5,6,7,8,9,10)
score1 <- c(10,14,17,18,20,21,22,26,28,31)
score2 <- c(6,9,11,14,17,18,20,24,25,26)
score3 <- c(11,13,15,17,19,21,23,25,27,29)
score4 <- c(43,44,45,46,47,48,49,50,51,52)
scores <- data.frame(observed,score1,score2, score3, score4)

...where the first value corresponds to observed score = 1, the second value corresponds to observed score = 2, and so on.

I need to create four new variables that correspond to score1, score2, score3, and score4. I could do the recoding manually, as shown below, but it is very slow and tedious:

people$value1[person_score == 1] <- 10
people$value1[person_score == 2] <- 14

...and so on for score1

people$value2[person_score == 1] <- 6
people$value2[person_score == 2] <- 9

...and so on for score2

people$value3[person_score == 1] <- 11
people$value3[person_score == 2] <- 13

...and so on for score3

people$value4[person_score == 1] <- 43
people$value4[person_score == 2] <- 44

...and so on for score4

Answer 1

I would just use match to find the correct rows from the scores data.frame:

idx <- match( people$person_score , scores$observed )

people_new <- cbind( people , scores[ idx , -1 ] )

head(people_new)
#  person_id person_score score1 score2 score3 score4
#1         1            1     10      6     11     43
#2         2            2     14      9     13     44
#3         3            3     17     11     15     45
#4         4            4     18     14     17     46
#5         5            5     20     17     19     47
#6         6            6     21     18     21     48
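
To make the role of match more concrete, here is a minimal sketch on a tiny hypothetical key table (the names `key`, `x`, and `recoded` are illustrative and not part of the question's data). For each element of its first argument, match returns the position of the first matching element in the second argument, and NA where nothing matches, so an observed score that is missing from the key would end up as NA rather than raising an error.

```r
# Hypothetical key table, smaller than `scores` in the question
key <- data.frame(observed = c(1, 2, 3),
                  score1   = c(10, 14, 17))

# Observed values to recode; note that 4 has no entry in the key
x <- c(3, 1, 4, 2)

# Position of each element of x within key$observed (NA if absent)
idx <- match(x, key$observed)
idx
# 3 1 NA 2

# Indexing with idx returns the recoded values in the order of x
recoded <- key$score1[idx]
recoded
# 17 10 NA 14
```

If NA rows appear in people_new, that would suggest some person_score values have no counterpart in scores$observed.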

Answer 2

You could use the qdap package's lookup function as follows:

## person_id <- c(1:50)
## person_score <- rep(1:10,5)
## people <- data.frame(person_id, person_score)
## 
## observed <- c(1,2,3,4,5,6,7,8,9,10)
## score1 <- c(10,14,17,18,20,21,22,26,28,31)
## score2 <- c(6,9,11,14,17,18,20,24,25,26)
## score3 <- c(11,13,15,17,19,21,23,25,27,29)
## score4 <- c(43,44,45,46,47,48,49,50,51,52)
## scores <- data.frame(observed,score1,score2, score3, score4)

library(qdap)
people[, 3:6] <- lapply(scores[, -1], function(x) lookup(people$person_score, scores[, 1], x))

people
##    person_id person_score score1 score2 score3 score4
## 1          1            1     10      6     11     43
## 2          2            2     14      9     13     44
## 3          3            3     17     11     15     45
## 4          4            4     18     14     17     46
## 5          5            5     20     17     19     47
## 6          6            6     21     18     21     48
## 7          7            7     22     20     23     49
## ...
## 50        50           10     31     26     29     52
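
If installing qdap is not an option, the same column-by-column idea can probably be expressed in base R. The following is only a sketch under that assumption, not the qdap implementation: it swaps lookup for match plus plain indexing, and the helper name `idx` is introduced here for illustration.

```r
# Same data as in the question (observed written as 1:10 for brevity)
people <- data.frame(person_id = 1:50, person_score = rep(1:10, 5))
scores <- data.frame(
  observed = 1:10,
  score1 = c(10, 14, 17, 18, 20, 21, 22, 26, 28, 31),
  score2 = c(6, 9, 11, 14, 17, 18, 20, 24, 25, 26),
  score3 = c(11, 13, 15, 17, 19, 21, 23, 25, 27, 29),
  score4 = c(43, 44, 45, 46, 47, 48, 49, 50, 51, 52)
)

# Row of `scores` that corresponds to each person's observed score
idx <- match(people$person_score, scores$observed)

# Apply the same index to every key column and attach the results
# under the key columns' own names (score1 ... score4)
people[names(scores)[-1]] <- lapply(scores[-1], function(x) x[idx])

head(people, 3)
```

Computing the index once and reusing it for each key column avoids repeating the lookup four times.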

Answer 3

It is just a join of the two data.frames: you can use merge

merge( people, scores, by.x = "person_score", by.y = "observed", all.x = TRUE )

or sqldf.

library(sqldf)
sqldf( "
  SELECT    *
  FROM      people
  LEFT JOIN scores
  ON        people.person_score = scores.observed
" )
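
One thing to keep in mind with the merge approach: by default merge sorts the result on the key column and places that column first, so the rows no longer come back in person_id order. A minimal sketch of restoring the original order, using a small hypothetical data set (the values below are illustrative, not the question's data):

```r
# Small hypothetical inputs with the same column names as the question
people <- data.frame(person_id = 1:6,
                     person_score = c(3, 1, 2, 3, 1, 2))
scores <- data.frame(observed = c(1, 2, 3),
                     score1   = c(10, 14, 17))

# Left join; the result is sorted by person_score, not person_id
merged <- merge(people, scores,
                by.x = "person_score", by.y = "observed",
                all.x = TRUE)

# Put the rows back in the original person_id order
merged <- merged[order(merged$person_id), ]
merged
```

Passing sort = FALSE to merge is another option, although the resulting row order is then unspecified, so an explicit order() call tends to be the safer choice.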
