Replace Nth occurrence of a character in a string with something else
Consider a = paste(1:10,collapse=", ")
which results in
a = "1, 2, 3, 4, 5, 6, 7, 8, 9, 10"
I would like to replace every nth (say, every 4th) occurrence of "," with something else (say, "\n"). The desired output would be:
"1, 2, 3, 4\n 5, 6, 7, 8\n 9, 10"
I am looking for code that uses gsub (or something equivalent) and some form of regular expression to achieve this.
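You can replace ((?:\\d+, ){3}\\d+), with \\1\n
You basically capture everything up to (but not including) the fourth comma in group 1, match that comma separately, and replace the whole match with \\1\n, which puts back the group 1 text followed by a newline and gives the expected result. The fourth number is matched with \\d+ as well, so multi-digit numbers are handled.
gsub("((?:\\d+, ){3}\\d+),", "\\1\n", "1, 2, 3, 4, 5, 6, 7, 8, 9, 10")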
Prints
[1] "1, 2, 3, 4\n 5, 6, 7, 8\n 9, 10"
Edit:
To generalize the above solution to any text, we can change \\d to [^,]
gsub("((?:[^,]+, ){3}[^,]+),", "\\1\n", "1, 2, 3, 4, 5, 6, 7, 8, 9, 10")
gsub("((?:[^,]+, ){3}[^,]+),", "\\1\n", "a, bb, ccc, dddd, 500, 600, 700, 800, 900, 1000")
Output:
[1] "1, 2, 3, 4\n 5, 6, 7, 8\n 9, 10"
[1] "a, bb, ccc, dddd\n 500, 600, 700, 800\n 900, 1000"
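The same idea can be wrapped so that n is a parameter instead of being hard-coded as {3}. This is only a minimal sketch: the name replace_every_nth_comma is made up for illustration, it assumes x is a character vector and the replacement contains no backslashes (which gsub would interpret), and it uses perl = TRUE so the non-capturing group is handled by PCRE.

```r
# Hypothetical helper (not from the answers above): replace every n-th comma.
replace_every_nth_comma <- function(x, n, replacement = "\n") {
  # Group 1 captures n - 1 comma-terminated fields plus the n-th field;
  # the n-th comma itself is matched outside the group and therefore dropped.
  pattern <- sprintf("((?:[^,]*,){%d}[^,]*),", n - 1)
  gsub(pattern, paste0("\\1", replacement), x, perl = TRUE)
}

a <- paste(1:10, collapse = ", ")
replace_every_nth_comma(a, 4)
#> [1] "1, 2, 3, 4\n 5, 6, 7, 8\n 9, 10"
replace_every_nth_comma(a, 2)
#> [1] "1, 2\n 3, 4\n 5, 6\n 7, 8\n 9, 10"
```

Using [^,]* rather than [^,]+ also lets empty fields (two commas in a row) count toward n.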
Another option, also using a regex with gsub:
a = paste(1:10,collapse=", ")
x <- gsub("([^,]*,[^,]*,[^,]*,[^,]*),", '\\1\n', a)
x
#> [1] "1, 2, 3, 4\n 5, 6, 7, 8\n 9, 10"
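A regex is the best way to go, but here is another approach that avoids building a regex pattern. It splits on spaces, so it relies on every comma being followed by a space: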
> str_vec <- strsplit(a, " ")[[1]]
> where <- seq_along(str_vec) %% 4 == 0
> str_vec[where] <- sub(",", "\n", str_vec[where])
> paste(str_vec, collapse=" ")
[1] "1, 2, 3, 4\n 5, 6, 7, 8\n 9, 10"
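If the string is not guaranteed to have a space after each comma, a variant of the split-and-paste idea can split on the separator itself. This is a rough sketch under stated assumptions: replace_nth_sep is a hypothetical name, x is a single string, and x does not end with the separator (strsplit drops a trailing empty piece).

```r
# Hypothetical helper: split on a literal separator, swap every n-th
# separator for the replacement, then stitch the pieces back together.
replace_nth_sep <- function(x, sep = ",", replacement = "\n", n = 4) {
  parts <- strsplit(x, sep, fixed = TRUE)[[1]]
  if (length(parts) < 2) return(x)            # nothing to replace
  seps <- rep(sep, length(parts) - 1)         # one separator between each pair
  seps[seq_along(seps) %% n == 0] <- replacement
  # rbind + c() interleaves: part1, sep1, part2, sep2, ..., last part, ""
  paste0(c(rbind(parts, c(seps, ""))), collapse = "")
}

a <- paste(1:10, collapse = ", ")
replace_nth_sep(a, ",", "\n", 4)
#> [1] "1, 2, 3, 4\n 5, 6, 7, 8\n 9, 10"
```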
regmatches<- as another option:
a <- "1, 2, 3, 4, 5, 6, 7, 8, 9, 10"
fn <- ","
rp <- "\n"
n <- 4
regmatches(a, gregexpr(fn, a)) <- list(c(rep(fn,n-1),rp))
a
#[1] "1, 2, 3, 4\n 5, 6, 7, 8\n 9, 10"
As a function:
a <- "1, 2, 3, 4, 5, 6, 7, 8, 9, 10"
replN <- function(x, fn, rp, n) {
regmatches(x, gregexpr(fn, x)) <- list(c(rep(fn,n-1),rp))
x
}
replN(a, ",", "\n", 4)
#[1] "1, 2, 3, 4\n 5, 6, 7, 8\n 9, 10"
You can even extend it to vectorize over the replacement argument:
a = "1, 2, 3, 4, 5, 6, 7, 8, 9, 10"
replN <- function(x,fn,rp,n) {
sel <- rep(fn, n*length(rp))
sel[seq_along(rp)*n] <- rp
regmatches(x, gregexpr(fn, x)) <- list(sel)
x
}
replN(a, fn=",", rp=c("1st","2nd"), n=4)
#[1] "1, 2, 3, 41st 5, 6, 7, 82nd 9, 10"
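One thing to keep in mind with the regmatches<- approach is that fn is passed to gregexpr as a regular expression, so a separator such as "." would match every character. A possible variant, sketched here under the assumption that the separator should always be taken literally, passes fixed = TRUE; the name replN_fixed is hypothetical. It relies on the same recycling of the replacement vector as the functions above.

```r
# Hypothetical variant of replN that treats `fn` as a literal string.
replN_fixed <- function(x, fn, rp, n) {
  # The replacement vector (n - 1 copies of fn, then rp) is recycled
  # across all matches, so every n-th match becomes rp.
  regmatches(x, gregexpr(fn, x, fixed = TRUE)) <- list(c(rep(fn, n - 1), rp))
  x
}

replN_fixed("a.b.c.d.e", ".", "|", 2)
#[1] "a.b|c.d|e"
```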
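This one can replace whole strings rather than single characters, and it replaces only the one occurrence you ask for rather than every nth. I made a function you can use easily :)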
> a = paste(1:10,collapse=", ")
> a
[1] "1, 2, 3, 4, 5, 6, 7, 8, 9, 10"
> # if you want the 2nd occurrence
> gsub("(.*?,.*?),(.*)", "\\1\n\\2", a)
[1] "1, 2\n 3, 4, 5, 6, 7, 8, 9, 10"
> # if you want the 3rd occurrence
> gsub("(.*?,.*?,.*?),(.*)", "\\1\n\\2", a)
[1] "1, 2, 3\n 4, 5, 6, 7, 8, 9, 10"
> # if you want the 4th occurrence
> gsub("(.*?,.*?,.*?,.*?),(.*)", "\\1\n\\2", a)
[1] "1, 2, 3, 4\n 5, 6, 7, 8, 9, 10"
> # if you want the last occurrence
> gsub("(.*,.*),(.*)", "\\1\n\\2", a)
[1] "1, 2, 3, 4, 5, 6, 7, 8, 9\n 10"
>
>
> replace.occurence <- function(x, pattern, replacement, which.occu) {
+ if( which.occu == "last" ) {
+ gsub(paste0("(.*", pattern, ".*)", pattern, "(.*)"), paste0("\\1", replacement, "\\2"), x)
+ } else {
+ gsub(paste0("(.*?", paste0(rep(paste0(pattern, ".*?"), which.occu - 1), collapse = ""), ")", pattern, "(.*)"), paste0("\\1", replacement, "\\2"), x)
+ }
+ }
>
> replace.occurence(a, pattern = ",", replacement = "\n", which.occu = 2)
[1] "1, 2\n 3, 4, 5, 6, 7, 8, 9, 10"
> replace.occurence(a, pattern = ",", replacement = "\n", which.occu = 3)
[1] "1, 2, 3\n 4, 5, 6, 7, 8, 9, 10"
> replace.occurence(a, pattern = ",", replacement = "\n", which.occu = 4)
[1] "1, 2, 3, 4\n 5, 6, 7, 8, 9, 10"
> replace.occurence(a, pattern = ",", replacement = "\n", which.occu = "last")
[1] "1, 2, 3, 4, 5, 6, 7, 8, 9\n 10"
>
> replace.occurence(a, pattern = ", 3, 4,", replacement = ", 4, 3,", which.occu = 1)
[1] "1, 2, 4, 3, 5, 6, 7, 8, 9, 10"
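Because replace.occurence pastes pattern into a larger regex, characters such as "." or "(" in pattern would be interpreted as regex syntax. As a rough alternative, assuming a single input string and a literal pattern, the nth match can be located with gregexpr and cut out with substr. The name replace_nth_once is made up for this sketch.

```r
# Hypothetical helper: replace only the n-th literal occurrence of `pattern`.
replace_nth_once <- function(x, pattern, replacement, n) {
  m <- gregexpr(pattern, x, fixed = TRUE)[[1]]
  # No match at all, or fewer than n matches: return the input unchanged.
  if (m[1] == -1 || n > length(m)) return(x)
  start <- m[n]
  len <- attr(m, "match.length")[n]
  paste0(substr(x, 1, start - 1), replacement,
         substr(x, start + len, nchar(x)))
}

a <- paste(1:10, collapse = ", ")
replace_nth_once(a, ",", "\n", 2)
#> [1] "1, 2\n 3, 4, 5, 6, 7, 8, 9, 10"
replace_nth_once("a.b.c", ".", "-", 2)
#> [1] "a.b-c"
```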