Collating values on a single row in an R data frame with dplyr
I have some data on students' exam scores:
  MAPPING PupilMatchingRefAnonymous POINTS
1    PHYS                         1     60
2    COMP                         1     40
3    ENGL                         1     20
4    MATH                         1     80
I would like to add each student's maths and English scores to every one of their exam rows, to make comparison easier:
  MAPPING PupilMatchingRefAnonymous POINTS MATH ENGL
1    PHYS                         1     60   80   20
2    COMP                         1     40   80   20
3    ENGL                         1     20   80   20
4    MATH                         1     80   80   20
I tried the following code, but had no luck:
comResults %>%
  select(MAPPING, PupilMatchingRefAnonymous, POINTS) %>%
  group_by(PupilMatchingRefAnonymous) %>%
  mutate(MATH = ifelse(MAPPING == "MATH", POINTS, NA))

Error: incompatible types, expecting a numeric vector
Any idea what I should try?
With base R this seems fairly simple:
df[as.character(df$MAPPING)] <- rep(df$POINTS, each = nrow(df))
df
#   MAPPING PupilMatchingRefAnonymous POINTS PHYS COMP ENGL MATH
# 1    PHYS                         1     60   60   40   20   80
# 2    COMP                         1     40   60   40   20   80
# 3    ENGL                         1     20   60   40   20   80
# 4    MATH                         1     80   60   40   20   80
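Note that rep(df$POINTS, each = nrow(df)) fills each new column with a single constant, so the one-liner above fits the one-pupil example. For data with several pupils, a per-pupil lookup is probably needed. The following is only a rough sketch of one way to do that in base R with match(); the two-pupil data frame is typed in by hand from the dput() output posted in this thread, and the names row_key, wanted and subject are made up for illustration.

```r
# Assumed example data: two pupils, values copied from the dput() in this thread
df <- data.frame(
  MAPPING = c("PHYS", "COMP", "ENGL", "MATH", "PHYS", "COMP", "ENGL", "MATH"),
  PupilMatchingRefAnonymous = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L),
  POINTS = c(60L, 40L, 20L, 80L, 20L, 40L, 0L, 80L),
  stringsAsFactors = FALSE
)

# One "pupil subject" key per row (row_key is a hypothetical helper name)
row_key <- paste(df$PupilMatchingRefAnonymous, df$MAPPING)

# For each subject of interest, find the row holding that pupil's score.
# match() returns NA when a pupil has no row for the subject.
for (subject in c("MATH", "ENGL")) {
  wanted <- paste(df$PupilMatchingRefAnonymous, subject)
  df[[subject]] <- df$POINTS[match(wanted, row_key)]
}

df
```

If a pupil has more than one row for the same subject, match() only picks up the first one, so duplicates would need handling separately.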
I'm not sure how dplyr handles merging, but this base-R solution delivers the result (apart from the column names, which should be simple to fix):
merge(merge(dat, dat[dat$MAPPING=="MATH", -1], by='PupilMatchingRefAnonymous'),
      dat[dat$MAPPING=="ENGL", -1], by='PupilMatchingRefAnonymous')
#--------
  PupilMatchingRefAnonymous MAPPING POINTS.x POINTS.y POINTS
1                         1    PHYS       60       80     20
2                         1    COMP       40       80     20
3                         1    ENGL       20       80     20
4                         1    MATH       80       80     20
Here is a two-student dataset for further testing:
dput(dat)
structure(list(MAPPING = structure(c(4L, 1L, 2L, 3L, 4L, 1L,
2L, 3L), .Label = c("COMP", "ENGL", "MATH", "PHYS"), class = "factor"),
PupilMatchingRefAnonymous = c(1L, 1L, 1L, 1L, 2L, 2L, 2L,
2L), POINTS = c(60L, 40L, 20L, 80L, 20L, 40L, 0L, 80L)), .Names = c("MAPPING",
"PupilMatchingRefAnonymous", "POINTS"), class = "data.frame", row.names = c(NA,
-8L))
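As for fixing the column names in the merge approach, one possible way is to rename POINTS in each subject subset before merging, so that no POINTS.x / POINTS.y suffixes appear. This is just a sketch, assuming one MATH row and one ENGL row per pupil; the data frame below is retyped from the dput() output above (with MAPPING as character rather than factor), and by_col, math, engl and res are names chosen for illustration.

```r
# Two-pupil data, retyped from the dput() output (MAPPING kept as character)
dat <- data.frame(
  MAPPING = c("PHYS", "COMP", "ENGL", "MATH", "PHYS", "COMP", "ENGL", "MATH"),
  PupilMatchingRefAnonymous = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L),
  POINTS = c(60L, 40L, 20L, 80L, 20L, 40L, 0L, 80L),
  stringsAsFactors = FALSE
)

by_col <- "PupilMatchingRefAnonymous"

# One lookup table per subject, with POINTS renamed to the subject name
math <- setNames(dat[dat$MAPPING == "MATH", c(by_col, "POINTS")], c(by_col, "MATH"))
engl <- setNames(dat[dat$MAPPING == "ENGL", c(by_col, "POINTS")], c(by_col, "ENGL"))

# all.x = TRUE keeps pupils who lack a MATH or ENGL row (they get NA)
res <- merge(merge(dat, math, by = by_col, all.x = TRUE),
             engl, by = by_col, all.x = TRUE)

res
```

merge() puts the key column first and may reorder rows, so the result is best compared by content and not by row position.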
I think you're trying to convert this from long format to wide format, right?
If so, try the following:
library(tidyr)
new.df <- comResults %>%
  spread(MAPPING, POINTS)
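This gives one row per student, with all of their academic information on that row. I know you only want maths and English, but perhaps this code can get you on the right track.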
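To get from the wide table back to the layout asked for in the question, one option might be to keep only the MATH and ENGL columns and join them onto the original long data. The sketch below assumes dplyr and tidyr are installed and that comResults looks like the two-pupil data posted earlier in this thread (retyped here by hand); wide and result are illustrative names.

```r
library(dplyr)
library(tidyr)

# Assumed stand-in for comResults, using the two-pupil values from this thread
comResults <- data.frame(
  MAPPING = c("PHYS", "COMP", "ENGL", "MATH", "PHYS", "COMP", "ENGL", "MATH"),
  PupilMatchingRefAnonymous = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L),
  POINTS = c(60L, 40L, 20L, 80L, 20L, 40L, 0L, 80L),
  stringsAsFactors = FALSE
)

# Long to wide: one row per pupil, one column per subject
wide <- comResults %>%
  spread(MAPPING, POINTS)

# Attach each pupil's MATH and ENGL scores to every one of their exam rows
result <- comResults %>%
  left_join(
    select(wide, PupilMatchingRefAnonymous, MATH, ENGL),
    by = "PupilMatchingRefAnonymous"
  )

result
```

In newer tidyr versions spread() is superseded; pivot_wider(names_from = MAPPING, values_from = POINTS) is the recommended equivalent for the reshaping step.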