How to extract and combine strings from two different columns in R

How can I combine strings from two different columns in R when each row holds a varying number of variables in no fixed order? Specifically, I have data like this, where one column lists variable names and the next lists the matching values, both delimited by ~^~:

1 | R~^~C                  4~^~5

2 | L~^~C~^~S            5~^~5~^~5

3 | S~^~R                    5~^~4

4 | V~^~L~^~S~^~R~^~C        5~^~4~^~5~^~3~^~5

...

How can I group them and get a new table like this?

  R  C  L  S  V
1 4  5  na na na
2 na 5  5  5  na
...

Thank you in advance!

First, a working example is needed:

txt <- "1 | R~^~C                  4~^~5
2 | L~^~C~^~S            5~^~5~^~5
3 | S~^~R                    5~^~4
4 | V~^~L~^~S~^~R~^~C        5~^~4~^~5~^~3~^~5"
d <- read.table(text = txt)

Then build a data frame to hold the values (from the 4th column), placing each one in the column named by the matching entry in the 3rd column:

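# Split the "~^~"-delimited names (3rd column) and values (4th column) per row.
# lapply() always returns a list, even when every row has the same length.
col_names <- lapply(gsub("~^~", ",", as.character(d$V3), fixed = TRUE),
                    function(x) scan(text = x, what = "", sep = ",", quiet = TRUE))
values <- lapply(gsub("~^~", ",", as.character(d$V4), fixed = TRUE),
                 function(x) scan(text = x, what = numeric(), sep = ",", quiet = TRUE))

# One NA-filled column per distinct name; the column order could vary
target <- data.frame(NA, NA, NA, NA, NA)
colnames(target) <- unique(unlist(col_names))
# Fill row i by assigning its values to the columns named in that row
for (i in seq_along(col_names)) {
  target[i, col_names[[i]]] <- values[[i]]
}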
> target
   R  C  L  S  V
1  4  5 NA NA NA
2 NA  5  5  5 NA
3  4 NA NA  5 NA
4  3  5  4  5  5
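
Alternatively, split both columns with strsplit() and place each row's values with match():

d <- read.table(text = "1 | R~^~C                  4~^~5
2 | L~^~C~^~S            5~^~5~^~5
3 | S~^~R                    5~^~4
4 | V~^~L~^~S~^~R~^~C        5~^~4~^~5~^~3~^~5", as.is = TRUE)

# All distinct column names, in order of first appearance
colNames <- unique(unlist(strsplit(d$V3, '\\~\\^\\~')))

# For each row, put the values at the positions of their names;
# names absent from the row stay NA. The result is a character matrix.
paired <- t(apply(d[, 3:4], 1, function(x){
  spli <- strsplit(x, '\\~\\^\\~')
  out <- rep(NA, length(colNames))
  out[match(spli[[1]], colNames)] <- spli[[2]]
  names(out) <- colNames
  return(out)
}))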
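
For comparison, here is a minimal base-R sketch of the same idea that yields a numeric data frame rather than character values. It assumes the sample data from the question and treats ~^~ as a literal delimiter (fixed = TRUE). The names keys, vals, all_keys, m and wide are illustrative only and do not come from the answers above.

```r
# Sample data from the question: id, "|", names, values.
txt <- "1 | R~^~C                  4~^~5
2 | L~^~C~^~S            5~^~5~^~5
3 | S~^~R                    5~^~4
4 | V~^~L~^~S~^~R~^~C        5~^~4~^~5~^~3~^~5"
d <- read.table(text = txt, as.is = TRUE)

# Split names and values on the literal delimiter "~^~".
keys <- strsplit(d$V3, "~^~", fixed = TRUE)
vals <- lapply(strsplit(d$V4, "~^~", fixed = TRUE), as.numeric)

# One column per distinct name, in order of first appearance.
all_keys <- unique(unlist(keys))

# Start from an all-NA numeric matrix and fill each row by column name.
m <- matrix(NA_real_, nrow = nrow(d), ncol = length(all_keys),
            dimnames = list(as.character(d$V1), all_keys))
for (i in seq_along(keys)) {
  m[i, keys[[i]]] <- vals[[i]]
}

wide <- as.data.frame(m)
print(wide)
```

Filling a pre-allocated matrix by column name avoids hard-coding the number of columns, so the sketch should carry over to inputs with a different set of variable names, provided each row has as many values as names.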
