Replace Nth occurrence of a character in a string with something else

Consider a = paste(1:10,collapse=", "), which results in

a = "1, 2, 3, 4, 5, 6, 7, 8, 9, 10"

I would like to replace every n-th (say, every 4th) occurrence of "," with something else (say "\n"). The desired output would be:

"1, 2, 3, 4\n 5, 6, 7, 8\n 9, 10"

I am looking for code that uses gsub (or something equivalent) and some form of regular expression to achieve this.

You can replace ((?:\\d+, ){3}\\d), with \\1\\n

This captures everything up to (but not including) the fourth comma in group 1 and matches the comma itself outside the group. Replacing the match with \\1\\n puts back the group 1 text followed by a newline, which gives the intended result.

R code demo:

gsub("((?:\\d+, ){3}\\d),", "\\1\n", "1, 2, 3, 4, 5, 6, 7, 8, 9, 10")

Prints:

[1] "1, 2, 3, 4\n 5, 6, 7, 8\n 9, 10"
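One caveat that may be worth checking: the pattern ends in a single \\d, so the fourth number in each group has to be one digit long. The sketch below compares it with a variant that uses \\d+ there. The variant and the variable names are illustrative adjustments, not part of the original answer, and perl = TRUE is passed only to pin down the regex engine.

```r
x <- "10, 20, 30, 40, 50, 60, 70, 80, 90"

# Original pattern: the trailing \\d allows exactly one digit before the
# comma, so nothing matches and the input comes back unchanged.
single <- gsub("((?:\\d+, ){3}\\d),", "\\1\n", x, perl = TRUE)

# Illustrative variant: \\d+ lets the fourth number have any length.
multi <- gsub("((?:\\d+, ){3}\\d+),", "\\1\n", x, perl = TRUE)

identical(single, x)
multi
```

With this input the first call leaves the string untouched, while the second breaks it after 40 and after 80.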

Edit:

To generalize the above solution to any text, we can replace \\d+ and \\d with [^,]+

New R code demo:

gsub("((?:[^,]+, ){3}[^,]+),", "\\1\n", "1, 2, 3, 4, 5, 6, 7, 8, 9, 10")
gsub("((?:[^,]+, ){3}[^,]+),", "\\1\n", "a, bb, ccc, dddd, 500, 600, 700, 800, 900, 1000")

Output:

[1] "1, 2, 3, 4\n 5, 6, 7, 8\n 9, 10"
[1] "a, bb, ccc, dddd\n 500, 600, 700, 800\n 900, 1000"
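The {3} in these patterns is simply n - 1 for n = 4, so the pattern can be built for any n. A minimal sketch, assuming a hypothetical helper name replace_every_nth_comma and using [^,]* so that empty fields are tolerated:

```r
# Hypothetical helper (not from the original answers): replace every n-th
# comma in x with `replacement`.
replace_every_nth_comma <- function(x, n, replacement = "\n") {
  # (?:[^,]*,){n-1} consumes the first n-1 comma-terminated fields,
  # [^,]* consumes the n-th field, and the final comma sits outside group 1.
  pattern <- sprintf("((?:[^,]*,){%d}[^,]*),", n - 1)
  # Note: backslashes in `replacement` are interpreted by gsub.
  gsub(pattern, paste0("\\1", replacement), x, perl = TRUE)
}

a <- paste(1:10, collapse = ", ")
replace_every_nth_comma(a, 4)
replace_every_nth_comma(a, 2)
```

For n = 4 this should reproduce the output shown above; for n = 2 every second comma becomes a newline.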

Using both regex and gsub:

a = paste(1:10,collapse=", ")
x <- gsub("([^,]*,[^,]*,[^,]*,[^,]*),", '\\1\n', a)
x
#> [1] "1, 2, 3, 4\n 5, 6, 7, 8\n 9, 10"

Regex is the best alternative; nonetheless, here's another approach without regex:

> str_vec <- strsplit(a, " ")[[1]] 
> where <- seq_along(str_vec) %% 4 == 0
> str_vec[where] <- sub(",", "\n", str_vec[where])
> paste(str_vec, collapse=" ")
[1] "1, 2, 3, 4\n 5, 6, 7, 8\n 9, 10"
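The split above relies on every item being separated by a single space. A sketch of the same non-regex idea that splits on the separator itself is shown below; the function name replace_nth_sep and its arguments are hypothetical.

```r
# Hypothetical helper: split on the literal separator, then rejoin the
# pieces, swapping every n-th separator for `replacement`.
replace_nth_sep <- function(x, sep = ",", replacement = "\n", n = 4) {
  parts <- strsplit(x, sep, fixed = TRUE)[[1]]
  if (length(parts) < 2) return(x)   # no separator to replace
  k <- length(parts) - 1             # number of separators between parts
  seps <- rep(sep, k)
  seps[seq_len(k) %% n == 0] <- replacement
  # Interleave parts and separators; the last part gets an empty separator.
  # Limitation: strsplit drops a trailing empty field, so a separator at the
  # very end of x is lost.
  paste0(c(rbind(parts, c(seps, ""))), collapse = "")
}

a <- paste(1:10, collapse = ", ")
replace_nth_sep(a)
```

Because the split is done with fixed = TRUE, the separator is treated as plain text and needs no escaping.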

regmatches as yet another alternative:

a <- "1, 2, 3, 4, 5, 6, 7, 8, 9, 10"

fn <- ","
rp <- "\n"
n <- 4

regmatches(a, gregexpr(fn, a)) <- list(c(rep(fn,n-1),rp))
a
#[1] "1, 2, 3, 4\n 5, 6, 7, 8\n 9, 10"

As a function:

a <- "1, 2, 3, 4, 5, 6, 7, 8, 9, 10"

replN <- function(x, fn, rp, n) {
    regmatches(x, gregexpr(fn, x)) <- list(c(rep(fn,n-1),rp))
    x
}
replN(a, ",", "\n", 4)
#[1] "1, 2, 3, 4\n 5, 6, 7, 8\n 9, 10"

You could even extend this to be vectorised over the replacement argument:

a = "1, 2, 3, 4, 5, 6, 7, 8, 9, 10"

replN <- function(x,fn,rp,n) {
    sel <- rep(fn, n*length(rp))
    sel[seq_along(rp)*n] <- rp
    regmatches(x, gregexpr(fn, x)) <- list(sel)
    x
}
replN(a, fn=",", rp=c("1st","2nd"), n=4)
#[1] "1, 2, 3, 41st 5, 6, 7, 82nd 9, 10"
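The versions above pass one replacement vector and let it be recycled across the matches. A possible variant, sketched here under the assumption that the pattern is a literal string, computes the positions per input string instead, so it can be applied to a character vector. The name repl_every_nth is hypothetical.

```r
# Hypothetical variant: replace every n-th literal match of `fn` in each
# element of x, building the replacement vector explicitly per element.
repl_every_nth <- function(x, fn, rp, n) {
  m <- gregexpr(fn, x, fixed = TRUE)   # literal matching, no regex escaping
  hits <- regmatches(x, m)             # list: one vector of matches per element
  regmatches(x, m) <- lapply(hits, function(h) {
    h[seq_along(h) %% n == 0] <- rp    # swap the n-th, 2n-th, ... match
    h
  })
  x
}

a <- "1, 2, 3, 4, 5, 6, 7, 8, 9, 10"
repl_every_nth(c(a, "x, y"), ",", "\n", 4)
```

The second element has only one comma, so it should come back unchanged.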

This one can replace with a string instead of a single character. I wrote a function that you can use easily :)

A demo to understand the regex:

> a = paste(1:10,collapse=", ")
> a
[1] "1, 2, 3, 4, 5, 6, 7, 8, 9, 10"
> # if you want the 2nd occurrence
> gsub("(.*?,.*?),(.*)", "\\1\n\\2", a)
[1] "1, 2\n 3, 4, 5, 6, 7, 8, 9, 10"
> # if you want the 3rd occurrence
> gsub("(.*?,.*?,.*?),(.*)", "\\1\n\\2", a)
[1] "1, 2, 3\n 4, 5, 6, 7, 8, 9, 10"
> # if you want the 4th occurrence
> gsub("(.*?,.*?,.*?,.*?),(.*)", "\\1\n\\2", a)
[1] "1, 2, 3, 4\n 5, 6, 7, 8, 9, 10"
> # if you want the last occurrence
> gsub("(.*,.*),(.*)", "\\1\n\\2", a)
[1] "1, 2, 3, 4, 5, 6, 7, 8, 9\n 10"
> 
> 
> replace.occurence <- function(x, pattern, replacement, which.occu) {
+   if( which.occu == "last" ) {
+     gsub(paste0("(.*", pattern, ".*)", pattern, "(.*)"), paste0("\\1", replacement, "\\2"), x)
+   } else {
+     gsub(paste0("(.*?", paste0(rep(paste0(pattern, ".*?"), which.occu - 1), collapse = ""), ")", pattern, "(.*)"), paste0("\\1", replacement, "\\2"), x)
+   }
+ }
> 
> replace.occurence(a, pattern = ",", replacement = "\n", which.occu = 2)
[1] "1, 2\n 3, 4, 5, 6, 7, 8, 9, 10"
> replace.occurence(a, pattern = ",", replacement = "\n", which.occu = 3)
[1] "1, 2, 3\n 4, 5, 6, 7, 8, 9, 10"
> replace.occurence(a, pattern = ",", replacement = "\n", which.occu = 4)
[1] "1, 2, 3, 4\n 5, 6, 7, 8, 9, 10"
> replace.occurence(a, pattern = ",", replacement = "\n", which.occu = "last")
[1] "1, 2, 3, 4, 5, 6, 7, 8, 9\n 10"
> 
> replace.occurence(a, pattern = ", 3, 4,", replacement = ", 4, 3,", which.occu = 1)
[1] "1, 2, 4, 3, 5, 6, 7, 8, 9, 10"
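Since replace.occurence pastes pattern into a regular expression, a pattern such as "." would need escaping. A sketch of an alternative that treats the pattern as literal text, using gregexpr with fixed = TRUE and regmatches, could look like this. The name replace_kth_fixed is hypothetical, and the which.occu argument mirrors the function above.

```r
# Hypothetical alternative: replace only the k-th (or last) literal
# occurrence of `pattern` in each element of x.
replace_kth_fixed <- function(x, pattern, replacement, which.occu) {
  m <- gregexpr(pattern, x, fixed = TRUE)
  hits <- regmatches(x, m)
  regmatches(x, m) <- lapply(hits, function(h) {
    # "last" maps to the index of the final match in this element
    k <- if (identical(which.occu, "last")) length(h) else which.occu
    if (k >= 1 && k <= length(h)) h[k] <- replacement
    h
  })
  x
}

a <- paste(1:10, collapse = ", ")
replace_kth_fixed(a, ",", "\n", 4)
replace_kth_fixed("a.b.c.d", ".", "-", 2)
replace_kth_fixed("a.b.c.d", ".", "-", "last")
```

If the string has fewer than k occurrences, nothing is replaced.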
