How can I combine strings from two different columns in R when each row holds several variables in no fixed order? Specifically, if I have a set of data like this:
1 | R~^~C 4~^~5
2 | L~^~C~^~S 5~^~5~^~5
3 | S~^~R 5~^~4
4 | V~^~L~^~S~^~R~^~C 5~^~4~^~5~^~3~^~5
...
How can I group them and get a new table like:
R C L S V
1 4 5 na na na
2 na 5 5 5 na
...
?
Thank you in advance!
First you need a working example:
txt <- "1 | R~^~C 4~^~5
2 | L~^~C~^~S 5~^~5~^~5
3 | S~^~R 5~^~4
4 | V~^~L~^~S~^~R~^~C 5~^~4~^~5~^~3~^~5"
d <- read.table(text=txt)
Then build a data frame to hold the values (in the 4th column), placed according to the column names (in the 3rd column):
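# lapply() always returns a list; sapply() would simplify to a matrix
# if every row happened to have the same number of entries
colnames <- lapply( gsub("~^~", ",", as.character(d$V3), fixed=TRUE),
                    function(x) scan(text=x, what="", sep=",", quiet=TRUE) )
values <- lapply( gsub("~^~", ",", as.character(d$V4), fixed=TRUE),
                  function(x) scan(text=x, what=numeric(), sep=",", quiet=TRUE) )
target <- data.frame(NA,NA,NA,NA,NA) # Could vary the order without loss of generality
colnames(target) <- unique(unlist(colnames))
# Assigning to row i adds that row and fills only the named columns
for ( i in seq_along(colnames) ){
  target[i, colnames[[i]] ] <- values[[i]]
}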
> target
R C L S V
1 4 5 NA NA NA
2 NA 5 5 5 NA
3 4 NA NA 5 NA
4 3 5 4 5 5
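As a variation on the steps above, here is a minimal sketch that sizes the result from the data instead of assuming exactly five distinct names. The names keys, vals, all_keys and wide are made up for illustration and are not part of the answer above; it uses only base R (strsplit with fixed=TRUE, matrix, as.data.frame).

```r
txt <- "1 | R~^~C 4~^~5
2 | L~^~C~^~S 5~^~5~^~5
3 | S~^~R 5~^~4
4 | V~^~L~^~S~^~R~^~C 5~^~4~^~5~^~3~^~5"
d <- read.table(text = txt, stringsAsFactors = FALSE)

# Split the delimited names and values; fixed = TRUE treats "~^~" literally
keys <- strsplit(d$V3, "~^~", fixed = TRUE)
vals <- lapply(strsplit(d$V4, "~^~", fixed = TRUE), as.numeric)

# Every distinct name, in order of first appearance (illustrative name)
all_keys <- unique(unlist(keys))

# Start from an all-NA numeric matrix with one column per distinct name
wide <- matrix(NA_real_, nrow = nrow(d), ncol = length(all_keys),
               dimnames = list(as.character(d$V1), all_keys))

# Fill each row by column name, so the order within a row does not matter
for (i in seq_along(keys)) {
  wide[i, keys[[i]]] <- vals[[i]]
}
wide <- as.data.frame(wide)
```

Because the columns are derived from unique(unlist(keys)), a sixth or seventh name in the input would simply become an extra column.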
Alternatively, using strsplit() and match():
d <- read.table(text = "1 | R~^~C 4~^~5
2 | L~^~C~^~S 5~^~5~^~5
3 | S~^~R 5~^~4
4 | V~^~L~^~S~^~R~^~C 5~^~4~^~5~^~3~^~5", as.is = TRUE)
# All distinct column names, in order of first appearance
colNames <- unique(unlist(strsplit(d$V3, '\\~\\^\\~')))
paired <- t(apply(d[, 3:4], 1, function(x){
  # spli[[1]] holds the names, spli[[2]] the matching values
  spli <- strsplit(x, '\\~\\^\\~')
  out <- rep(NA_real_, length(colNames))
  # as.numeric() keeps the result numeric rather than character
  out[match(spli[[1]], colNames)] <- as.numeric(spli[[2]])
  names(out) <- colNames
  return(out)
}))
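The same match() idea can also be written without a per-row function, by computing a row index and a column index for every name/value pair and filling the matrix in one assignment. This is only a sketch under the same input assumptions; row_idx, col_idx and filled are illustrative names, not part of the answer above.

```r
d <- read.table(text = "1 | R~^~C 4~^~5
2 | L~^~C~^~S 5~^~5~^~5
3 | S~^~R 5~^~4
4 | V~^~L~^~S~^~R~^~C 5~^~4~^~5~^~3~^~5", as.is = TRUE)

keys <- strsplit(d$V3, "~^~", fixed = TRUE)
vals <- strsplit(d$V4, "~^~", fixed = TRUE)
colNames <- unique(unlist(keys))

# One (row, column) pair per name/value entry
row_idx <- rep(seq_along(keys), lengths(keys))
col_idx <- match(unlist(keys), colNames)

# All-NA numeric matrix, then fill every cell at once via a two-column index
filled <- matrix(NA_real_, nrow = nrow(d), ncol = length(colNames),
                 dimnames = list(NULL, colNames))
filled[cbind(row_idx, col_idx)] <- as.numeric(unlist(vals))
```

Indexing a matrix with a two-column matrix of positions is standard base R behaviour, so cells that never appear in the input stay NA.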