Return all factor levels by name as new columns from a three column data.table [R]
Is there a way to solve the following problem using data.table or dplyr?
library(data.table)
(DT = data.table(a = LETTERS[c(1, 1:3, 8)], b = c(2, 4:7),
c = as.factor(c("bob", "mary", "bob", "george", "alice")), key="a"))
This returns:
#    a b      c
# 1: A 2    bob
# 2: A 4   mary
# 3: B 5    bob
# 4: C 6 george
# 5: H 7  alice
I would like to get this:
#    a alice bob george mary
# 1: A    NA   2     NA   NA
# 2: A    NA  NA     NA    4
# 3: B    NA   5     NA   NA
# 4: C    NA  NA      6   NA
# 5: H     7  NA     NA   NA
This is similar to creating dummy variables.
uc <- sort(unique(as.character(DT$c)))
DT[,(uc):=lapply(uc,function(x)ifelse(c==x,b,NA))][,c('b','c'):=NULL]
I have heard that ifelse can be slow, so there may be a faster route:
uc <- sort(unique(as.character(DT$c)))
is <- 1:nrow(DT)
js <- as.character(DT$c)
vs <- DT$b
DT[,(uc):=NA_integer_]
for (i in is) set(DT,i=is[i],j=js[i],value=vs[i])
DT[,c('b','c'):=NULL]
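The row-by-row set() loop above can probably also be written as a single vectorized assignment. The following is only a sketch of that idea using base R matrix indexing, not something from the original answer; the names m and res are made up for illustration, and the example data is rebuilt so the block runs on its own.

```r
library(data.table)

# Same example data as in the question
DT <- data.table(a = LETTERS[c(1, 1:3, 8)], b = c(2, 4:7),
                 c = as.factor(c("bob", "mary", "bob", "george", "alice")), key = "a")

uc <- sort(unique(as.character(DT$c)))

# Start from an all-NA numeric matrix: one row per row of DT, one column per level
m <- matrix(NA_real_, nrow = nrow(DT), ncol = length(uc),
            dimnames = list(NULL, uc))

# A two-column (row, column) index matrix fills every target cell in one step
m[cbind(seq_len(nrow(DT)), match(as.character(DT$c), uc))] <- DT$b

# Put the grouping column back in front (res is a hypothetical name)
res <- cbind(DT[, .(a)], as.data.table(m))
res
```

Unlike the loop, this leaves DT untouched and builds a new table, so it trades the by-reference update for a single bulk assignment.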
Just using Frank's dummy-variable idea:
df1 <- cbind( a = DT$a, as.data.frame( model.matrix(a ~ c - 1, data = DT ) * DT$b ))
df1[df1==0] <- NA
names(df1) <- c("a", levels(DT$c))
#   a alice bob george mary
# 1 A    NA   2     NA   NA
# 2 A    NA  NA     NA    4
# 3 B    NA   5     NA   NA
# 4 C    NA  NA      6   NA
# 5 H     7  NA     NA   NA
Base R:
names <- unique(as.character(DT$c))
cbind(a = DT$a, as.data.frame(sapply(names, function(x) ifelse(DT$c==x, DT$b, NA))))
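Since the question asks for a data.table route, a long-to-wide reshape with dcast may be worth considering as well. This is a sketch of a different technique from the answers above, under the assumption that one output row per input row is wanted; the helper column id and the name wide are hypothetical and exist only so that the repeated value "A" in column a does not collapse into a single row.

```r
library(data.table)

# Same example data as in the question
DT <- data.table(a = LETTERS[c(1, 1:3, 8)], b = c(2, 4:7),
                 c = as.factor(c("bob", "mary", "bob", "george", "alice")), key = "a")

# Work on a copy so the original table is not modified by reference
tmp <- copy(DT)

# Row id keeps duplicated values of a on separate output rows
tmp[, id := .I]

# One column per level of c, filled with b; missing combinations stay NA
wide <- dcast(tmp, id + a ~ c, value.var = "b")

# Drop the helper column again
wide[, id := NULL]
wide
```

Without the id column, the two "A" rows would be merged into one, which is a different result from the one requested in the question.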