
How to extract and combine strings from two different columns in R

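I would like to know how to combine strings from two different columns in R when there are multiple, unordered variables. Specifically, if I have a set of data like this: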

1 | R~^~C                  4~^~5

2 | L~^~C~^~S            5~^~5~^~5

3 | S~^~R                    5~^~4

4 | V~^~L~^~S~^~R~^~C        5~^~4~^~5~^~3~^~5

...

how can I group them and get a new table such as:

  R  C  L  S  V

1 4  5  na na na

2 na 5  5  5  na

...

Thanks in advance!

Need a working example:

txt <- "1 | R~^~C                  4~^~5
2 | L~^~C~^~S            5~^~5~^~5
3 | S~^~R                    5~^~4
4 | V~^~L~^~S~^~R~^~C        5~^~4~^~5~^~3~^~5"
d <- read.table(text=txt)
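It may help to check how `read.table` splits the sample before going further. The sketch below assumes the default whitespace separator, so the row number, the `|` character, the names string and the values string land in `V1` to `V4`; `stringsAsFactors = FALSE` is added here so the result does not depend on the R version.

```r
# Sample data copied from the question
txt <- "1 | R~^~C                  4~^~5
2 | L~^~C~^~S            5~^~5~^~5
3 | S~^~R                    5~^~4
4 | V~^~L~^~S~^~R~^~C        5~^~4~^~5~^~3~^~5"

# Whitespace-separated fields become columns V1..V4
d <- read.table(text = txt, stringsAsFactors = FALSE)

# V3 holds the "~^~"-joined names, V4 the "~^~"-joined values
str(d)
```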

Then build a data frame to hold the values (in column 4), placed according to the column names (in column 3):

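# Split the "~^~"-separated names (column 3) and values (column 4) row by row.
# lapply() always returns a list, even when every row has the same length.
col_names <- lapply(gsub("~^~", ",", as.character(d$V3), fixed = TRUE),
                    function(x) scan(text = x, what = "", sep = ",", quiet = TRUE))
values <- lapply(gsub("~^~", ",", as.character(d$V4), fixed = TRUE),
                 function(x) scan(text = x, what = numeric(), sep = ",", quiet = TRUE))

target <- data.frame(NA, NA, NA, NA, NA) # Could vary the order without loss of generality
colnames(target) <- unique(unlist(col_names))
# Fill row i, placing each value under its matching column name
for (i in seq_along(col_names)) {
  target[i, col_names[[i]]] <- values[[i]]
}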
> target
   R  C  L  S  V
1  4  5 NA NA NA
2 NA  5  5  5 NA
3  4 NA NA  5 NA
4  3  5  4  5  5
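
The answer above creates `target` with exactly five `NA` columns, which matches this sample. If the number of distinct names is not known in advance, one possible variation is to size an `NA` matrix from the data. This is only a sketch under that assumption; `keys`, `vals`, `all_keys` and `wide` are names introduced here for illustration.

```r
# Sample data from the question
txt <- "1 | R~^~C                  4~^~5
2 | L~^~C~^~S            5~^~5~^~5
3 | S~^~R                    5~^~4
4 | V~^~L~^~S~^~R~^~C        5~^~4~^~5~^~3~^~5"
d <- read.table(text = txt, stringsAsFactors = FALSE)

# Split each cell on the literal "~^~" separator
keys <- strsplit(d$V3, "~^~", fixed = TRUE)
vals <- lapply(strsplit(d$V4, "~^~", fixed = TRUE), as.numeric)

# One column per distinct name, in order of first appearance
all_keys <- unique(unlist(keys))

# Pre-allocate an NA matrix so the column count comes from the data
wide <- matrix(NA_real_, nrow = nrow(d), ncol = length(all_keys),
               dimnames = list(NULL, all_keys))

# Place each row's values under the matching column names
for (i in seq_along(keys)) {
  wide[i, keys[[i]]] <- vals[[i]]
}
wide <- as.data.frame(wide)
print(wide)
```

Filling a numeric matrix and converting it at the end keeps every column numeric, with `NA` wherever a name does not occur in a row.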

d <- read.table(text = "1 | R~^~C                  4~^~5
2 | L~^~C~^~S            5~^~5~^~5
3 | S~^~R                    5~^~4
4 | V~^~L~^~S~^~R~^~C        5~^~4~^~5~^~3~^~5", as.is = TRUE)

# All distinct column names, in order of first appearance
colNames <- unique(unlist(strsplit(d$V3, '~^~', fixed = TRUE)))

# For each row, split names and values, then place each value under its name
paired <- t(apply(d[, 3:4], 1, function(x){
  spli <- strsplit(x, '~^~', fixed = TRUE)
  out <- rep(NA, length(colNames))
  out[match(spli[[1]], colNames)] <- spli[[2]]
  names(out) <- colNames
  return(out)
}))
# paired is a character matrix: the values are strings such as "4"
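
Because `apply` passes each row as a character vector, `paired` comes out as a character matrix, not as numbers. If numeric columns are wanted, one way is to change the storage mode and then convert to a data frame. The sketch below uses a small hand-written stand-in for the first two rows of `paired` so that it runs on its own; `paired_num` and `result` are names introduced here.

```r
# Hypothetical stand-in for the first two rows of the character matrix
paired <- matrix(c("4", "5", NA,  NA,  NA,
                   NA,  "5", "5", "5", NA),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(NULL, c("R", "C", "L", "S", "V")))

# Convert the strings to doubles in place; dimensions and names are kept
paired_num <- paired
storage.mode(paired_num) <- "double"

# A data frame with one numeric column per name
result <- as.data.frame(paired_num)
str(result)
```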

