R: Paste together some string vector elements based on list of indexes

I have a string vector like this:

x <- c("ermanaric cayce nonwashable climactical outseeing dorr nubble",
       "aver unsegregating preprofess lumme noontime triskele",
       "riverbank walachian penza",
       "schlieren calthrop",
       "hutlike paraphyllium unservile chaplainship bordelaise",
       "phlogotic strategics jowlier orthopaedic nonprofiteering",
       "vizir rudenture shopkeeper",
       "interestuarine sardis",
       "anthas figuring",
       "unphased engle german emporium organometallic didy uneclipsing",
       "bronzy conant reballot",
       "extrados facinorous acrolithic",
       "paralyzation uningratiating enzymatically enuresis",
       "unscholastic extemporarily",
       "discipleship fossilize summae",
       "concretize intercharge palpate gombroon initiatrices",
       "intimation progressiveness",
       "unpictorialise",
       "romanticization",
       "wynnewood",
       "unmate libratory polysynthetic")

Some of the elements need to be pasted together to form longer strings. I have a list of vectors that contains the indexes of the elements that have to be pasted together:

indx <- list(c(3, 4), c(7, 8, 9), c(11, 12), c(14, 15), c(17, 18, 19, 20))

That is, the 3rd and 4th elements have to be pasted together to form the string "riverbank walachian penza schlieren calthrop", the 7th, 8th and 9th elements have to be pasted together to form the string "vizir rudenture shopkeeper interestuarine sardis anthas figuring", and so on, keeping the rest of the strings in the same order. The resulting vector of strings would look like this:

y <- c("ermanaric cayce nonwashable climactical outseeing dorr nubble",
       "aver unsegregating preprofess lumme noontime triskele",
       "riverbank walachian penza schlieren calthrop",
       "hutlike paraphyllium unservile chaplainship bordelaise",
       "phlogotic strategics jowlier orthopaedic nonprofiteering",
       "vizir rudenture shopkeeper interestuarine sardis anthas figuring",
       "unphased engle german emporium organometallic didy uneclipsing",
       "bronzy conant reballot extrados facinorous acrolithic",
       "paralyzation uningratiating enzymatically enuresis",
       "unscholastic extemporarily discipleship fossilize summae",
       "concretize intercharge palpate gombroon initiatrices",
       "intimation progressiveness unpictorialise romanticization wynnewood",
       "unmate libratory polysynthetic")

I tried the following without any success:

myfun <- function(obj, indx) {
  paste(obj)[length(indx)]
}

mapply(myfun, x, m)

Can someone help?

indx does not contain an entry for every item in x, yet every item has to be either returned unchanged or merged into a group. That makes this a bit more challenging.
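To make the gap concrete, here is a minimal sketch on toy data (the names x_toy and indx_toy are made up for illustration and are not from the question). It lists the positions that indx never mentions and that therefore have to pass through unchanged.

```r
# Toy data, hypothetical and much shorter than the vector in the question
x_toy <- c("a", "b", "c", "d", "e")
indx_toy <- list(c(2, 3))

covered <- unlist(indx_toy)                      # positions that get merged
untouched <- setdiff(seq_along(x_toy), covered)  # positions that pass through as-is
untouched
```

Any solution has to interleave these untouched positions with the merged groups in the original order.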

One idea would be to update the vector successively with Reduce, using temporary NAs to keep positions aligned with the index numbers.

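# Merge each index group into its first slot, pad the rest with NA, then drop the NAs
my.y <- c(na.omit(Reduce(function(s, i)
  replace(s, i, c(paste(s[i], collapse = " "), rep(NA, length(i) - 1))), indx, x)))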

identical(my.y,y)
#> [1] TRUE

This matches the desired output.
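A single step of that Reduce call can be sketched on toy data (the values of s and i here are hypothetical, chosen only to keep the example short). It shows why the NA placeholders are needed and what na.omit() and c() each contribute.

```r
# Toy data, hypothetical
s <- c("a", "b", "c", "d", "e")
i <- c(2, 3)

# One step: the merged string goes into the first slot of the group,
# and NA fills the remaining slots so later index groups still line up.
step <- replace(s, i, c(paste(s[i], collapse = " "), rep(NA, length(i) - 1)))
step

# Once every group is processed, na.omit() drops the placeholders
# and c() strips the attributes that na.omit() attaches.
result <- c(na.omit(step))
result
```

One consequence of this design, as far as I can tell: any genuine NA already present in x would be dropped along with the placeholders, so the approach assumes x has no missing values.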

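# First index of each group: the slot that will hold the merged string
indx2 <- sapply(indx, '[', 1)
# Overwrite those slots with the merged strings (note: this turns x into a list)
x[indx2] <- Map(function(x,y) paste(x[y], collapse=" "), list(x), indx)
# Keep the untouched positions plus the first slot of each group, in order
unlist(x[sort(c(setdiff(seq_along(x), unlist(indx)),indx2))])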
#  [1] "ermanaric cayce nonwashable climactical outseeing dorr nubble"      
#  [2] "aver unsegregating preprofess lumme noontime triskele"              
#  [3] "riverbank walachian penza schlieren calthrop"                       
#  [4] "hutlike paraphyllium unservile chaplainship bordelaise"             
#  [5] "phlogotic strategics jowlier orthopaedic nonprofiteering"           
#  [6] "vizir rudenture shopkeeper interestuarine sardis anthas figuring"   
#  [7] "unphased engle german emporium organometallic didy uneclipsing"     
#  [8] "bronzy conant reballot extrados facinorous acrolithic"              
#  [9] "paralyzation uningratiating enzymatically enuresis"                 
# [10] "unscholastic extemporarily discipleship fossilize summae"           
# [11] "concretize intercharge palpate gombroon initiatrices"               
# [12] "intimation progressiveness unpictorialise romanticization wynnewood"
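# [13] "unmate libratory polysynthetic"

indx <- list(c(3, 4), c(7, 8, 9), c(11, 12), c(14, 15), c(17, 18, 19, 20))
# Add a singleton group for every index not already covered by indx
indx1 <- c(lapply(setdiff(seq_along(x), unlist(indx)), c), indx)

# Sort the groups by their first index so the original order is kept
indx2 <- indx1[order(sapply(indx1, "[[", 1))]

# Paste the elements of each group together
sapply(indx2, function(z) paste(x[z], collapse = " "))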
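The same grouping idea can also be written with a label vector and split(). This is an alternative sketch, not one of the answers above: merge_by_index is a hypothetical helper name, and it assumes the index groups do not overlap and that each group lists its smallest index first.

```r
# Hypothetical helper, not part of the answers above
merge_by_index <- function(x, indx) {
  grp <- seq_along(x)             # start with every element in its own group
  for (i in indx) grp[i] <- i[1]  # label each index group by its first position
  # split() orders the groups by label, so the original order is preserved
  unname(vapply(split(x, grp), paste, character(1), collapse = " "))
}

merge_by_index(c("a", "b", "c", "d", "e"), list(c(2, 3), c(4, 5)))
```

With the data from the question, merge_by_index(x, indx) should be called on the original character vector x, before any of the code above has modified it.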
