Recode a variable to multiple new values
I have a dataset containing observed scores for a group of people, like this:
person_id <- c(1:50)
person_score <- rep(1:10,5)
people <- data.frame(person_id, person_score)
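I need to create a set of new variables that are recoded values of the observed score. I have a set of variables that act as the "key" for converting the observed score into each new variable, like this: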
observed <- c(1,2,3,4,5,6,7,8,9,10)
score1 <- c(10,14,17,18,20,21,22,26,28,31)
score2 <- c(6,9,11,14,17,18,20,24,25,26)
score3 <- c(11,13,15,17,19,21,23,25,27,29)
score4 <- c(43,44,45,46,47,48,49,50,51,52)
scores <- data.frame(observed,score1,score2, score3, score4)
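...where the first value corresponds to an observed score of 1, the second value to an observed score of 2, and so on.
I need to create four variables, one each for score1, score2, score3, and score4. I could do the recoding by hand, as shown below, but that is very slow and tedious: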
people$value1[person_score == 1] <- 10
people$value1[person_score == 2] <- 14
...and so on
people$value2[person_score == 1] <- 6
people$value2[person_score == 2] <- 9
...and so on
people$value3[person_score == 1] <- 11
people$value3[person_score == 2] <- 13
...and so on
people$value4[person_score == 1] <- 43
people$value4[person_score == 2] <- 44
...and so on
I would just use match to find the correct rows in the scores data.frame:
idx <- match( people$person_score , scores$observed )
people_new <- cbind( people , scores[ idx , -1 ] )
head(people_new)
# person_id person_score score1 score2 score3 score4
#1 1 1 10 6 11 43
#2 2 2 14 9 13 44
#3 3 3 17 11 15 45
#4 4 4 18 14 17 46
#5 5 5 20 17 19 47
#6 6 6 21 18 21 48
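The question asks for columns named value1 to value4, while the match approach above keeps the key's own names (score1 to score4). A minimal sketch of one way to rename them, assuming the data from the question; the intermediate name recoded is only illustrative:

```r
# Data from the question
person_id <- c(1:50)
person_score <- rep(1:10, 5)
people <- data.frame(person_id, person_score)

observed <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
score1 <- c(10, 14, 17, 18, 20, 21, 22, 26, 28, 31)
score2 <- c(6, 9, 11, 14, 17, 18, 20, 24, 25, 26)
score3 <- c(11, 13, 15, 17, 19, 21, 23, 25, 27, 29)
score4 <- c(43, 44, 45, 46, 47, 48, 49, 50, 51, 52)
scores <- data.frame(observed, score1, score2, score3, score4)

# Row of the key that matches each person's observed score
idx <- match(people$person_score, scores$observed)

# Pull the recoded columns, then rename score1..score4 to value1..value4
recoded <- scores[idx, -1]
names(recoded) <- sub("^score", "value", names(recoded))
rownames(recoded) <- NULL

people_new <- cbind(people, recoded)
head(people_new, 3)

# An observed score that is absent from the key matches to NA (no error),
# so such a person would end up with NA in every recoded column.
match(11, scores$observed)
```

Because match returns NA for anything not found in the key, it may be worth checking anyNA(idx) before relying on the recoded columns.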
You can use the lookup function from the qdap package as follows:
## person_id <- c(1:50)
## person_score <- rep(1:10,5)
## people <- data.frame(person_id, person_score)
##
## observed <- c(1,2,3,4,5,6,7,8,9,10)
## score1 <- c(10,14,17,18,20,21,22,26,28,31)
## score2 <- c(6,9,11,14,17,18,20,24,25,26)
## score3 <- c(11,13,15,17,19,21,23,25,27,29)
## score4 <- c(43,44,45,46,47,48,49,50,51,52)
## scores <- data.frame(observed,score1,score2, score3, score4)
library(qdap)
people[, 3:6] <- lapply(scores[, -1], function(x) lookup(people$person_score, scores[, 1], x))
people
## person_id person_score score1 score2 score3 score4
## 1 1 1 10 6 11 43
## 2 2 2 14 9 13 44
## 3 3 3 17 11 15 45
## 4 4 4 18 14 17 46
## 5 5 5 20 17 19 47
## 6 6 6 21 18 21 48
## 7 7 7 22 20 23 49
.
.
.
## 50 50 10 31 26 29 52
This is just a join of two data.frames. You can use merge:
merge( people, scores, by.x = "person_score", by.y = "observed", all.x = TRUE )
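One thing to keep in mind with merge: by default (sort = TRUE) it sorts the result on the key column and puts that column first, so the rows no longer come back in person_id order. A minimal sketch of restoring the original order and column layout, assuming the data from the question; the name merged is only illustrative:

```r
# Data from the question
person_id <- c(1:50)
person_score <- rep(1:10, 5)
people <- data.frame(person_id, person_score)

observed <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
score1 <- c(10, 14, 17, 18, 20, 21, 22, 26, 28, 31)
score2 <- c(6, 9, 11, 14, 17, 18, 20, 24, 25, 26)
score3 <- c(11, 13, 15, 17, 19, 21, 23, 25, 27, 29)
score4 <- c(43, 44, 45, 46, 47, 48, 49, 50, 51, 52)
scores <- data.frame(observed, score1, score2, score3, score4)

# Left join: keep every person, attach the matching row of the key
merged <- merge(people, scores,
                by.x = "person_score", by.y = "observed", all.x = TRUE)

# Put the rows back in person_id order and person_id back in front
merged <- merged[order(merged$person_id),
                 c("person_id", "person_score", paste0("score", 1:4))]
rownames(merged) <- NULL

head(merged, 3)
```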
Or sqldf:
library(sqldf)
sqldf( "
SELECT *
FROM people
LEFT JOIN scores
ON people.person_score = scores.observed
" )