How to extract and combine strings from two different columns in R
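How can I combine strings from two different columns in R when there are multiple variables in no fixed order? Specifically, if I have a set of data like this: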
1 | R~^~C 4~^~5
2 | L~^~C~^~S 5~^~5~^~5
3 | S~^~R 5~^~4
4 | V~^~L~^~S~^~R~^~C 5~^~4~^~5~^~3~^~5
...
How can I group them to get a new table like the following?
R C L S V
1 4 5 na na na
2 na 5 5 5 na
...
Thanks in advance!
This needs a working example:
txt <- "1 | R~^~C 4~^~5
2 | L~^~C~^~S 5~^~5~^~5
3 | S~^~R 5~^~4
4 | V~^~L~^~S~^~R~^~C 5~^~4~^~5~^~3~^~5"
d <- read.table(text=txt)
Then build a data frame to hold the values (from the 4th column), placed according to the column names (from the 3rd column):
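# Split the "~^~"-delimited names (V3) and values (V4) of each row into vectors.
# lapply always returns a list, even if every row has the same number of items.
colnames <- lapply( gsub("~^~", "," , as.character(d$V3), fixed=TRUE),
                    function(x) scan(text=x, what="", sep=",", quiet=TRUE) )
values <- lapply( gsub("~^~", "," , as.character(d$V4), fixed=TRUE),
                  function(x) scan(text=x, what=numeric(), sep=",", quiet=TRUE) )
target <- data.frame(NA,NA,NA,NA,NA) # Could vary the order without loss of generality
colnames(target) <- unique(unlist(colnames))
# Fill row i, writing each value into the column named by its key
for ( i in seq_along(colnames) ){
    target[i, colnames[[i]] ] <- values[[i]] }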
> target
R C L S V
1 4 5 NA NA NA
2 NA 5 5 5 NA
3 4 NA NA 5 NA
4 3 5 4 5 5
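The same idea can be sketched without hard-coding the five columns. This is only a variation on the approach above, not a replacement for it: it assumes the delimiter is always the literal string "~^~" and that each row has as many values as names. The names keys, vals, all_keys and wide are made up for this illustration.

```r
txt <- "1 | R~^~C 4~^~5
2 | L~^~C~^~S 5~^~5~^~5
3 | S~^~R 5~^~4
4 | V~^~L~^~S~^~R~^~C 5~^~4~^~5~^~3~^~5"
d <- read.table(text = txt, stringsAsFactors = FALSE)

# fixed = TRUE treats "~^~" as a literal string, not a regular expression
keys <- strsplit(d$V3, "~^~", fixed = TRUE)
vals <- lapply(strsplit(d$V4, "~^~", fixed = TRUE), as.numeric)

# One column per distinct name, in order of first appearance
all_keys <- unique(unlist(keys))

# Start from an all-NA numeric matrix sized from the data itself
wide <- matrix(NA_real_, nrow = nrow(d), ncol = length(all_keys),
               dimnames = list(NULL, all_keys))

# Fill each row by indexing the columns by name
for (i in seq_along(keys)) {
  wide[i, keys[[i]]] <- vals[[i]]
}

wide <- as.data.frame(wide)
```

With the sample data, wide should hold the same values as target above, and the number of columns follows from however many distinct names appear in V3.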
d <- read.table(text = "1 | R~^~C 4~^~5
2 | L~^~C~^~S 5~^~5~^~5
3 | S~^~R 5~^~4
4 | V~^~L~^~S~^~R~^~C 5~^~4~^~5~^~3~^~5", as.is = TRUE)
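# All distinct names, in order of first appearance
colNames <- unique(unlist(strsplit(d$V3, '\\~\\^\\~')))
paired <- t(apply(d[, 3:4], 1, function(x){
    # x[1] holds the names, x[2] the matching values
    spli <- strsplit(x, '\\~\\^\\~')
    out <- rep(NA, length(colNames))
    # Convert to numeric so the result is a numeric matrix rather than character
    out[match(spli[[1]], colNames)] <- as.numeric(spli[[2]])
    names(out) <- colNames
    return(out)
}))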
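Another way to look at the problem is to go through a long format first, with one row per (row id, name, value) triple, and then spread it out with base R's tapply. The sketch below is an alternative under the same assumptions as the answers above (literal "~^~" delimiter, equal numbers of names and values per row, and no name repeated within a row). The names keys, vals, long and wide are hypothetical.

```r
d <- read.table(text = "1 | R~^~C 4~^~5
2 | L~^~C~^~S 5~^~5~^~5
3 | S~^~R 5~^~4
4 | V~^~L~^~S~^~R~^~C 5~^~4~^~5~^~3~^~5", as.is = TRUE)

keys <- strsplit(d$V3, "~^~", fixed = TRUE)
vals <- strsplit(d$V4, "~^~", fixed = TRUE)

# Long format: one line per (row, key, value)
long <- data.frame(
  row   = rep(seq_along(keys), lengths(keys)),
  key   = unlist(keys),
  value = as.numeric(unlist(vals))
)

# Fix the column order to order of first appearance instead of alphabetical
long$key <- factor(long$key, levels = unique(long$key))

# Cross-tabulate; (row, key) combinations that never occur are left as NA
wide <- tapply(long$value, list(long$row, long$key), FUN = function(v) v[1])
```

Here wide is a numeric matrix whose row names are the row ids and whose column names are the distinct names; as.data.frame(wide) turns it into a data frame if that is preferred.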