
Regular expression that removes the second occurrence of a character in a string

I'm trying to write a JavaScript function that removes every repeated occurrence of a character using a regular expression. Here is my function:

var removeSecondOccurrence = function(string) {
return string.replace(/(.*)\1/gi, '');
}

It only removes consecutive occurrences. I'd like it to remove non-consecutive ones as well. For example, papirana should become pairn.

Please help.

A non-regexp solution:

 "papirana".split("").filter(function(x, n, self) { return self.indexOf(x) == n }).join("")

Regexp code is complicated, because JS (before ES2018) doesn't support lookbehinds:

var str = "papirana";
var re = /(.)(.*?)\1/;
// Each pass drops one repeated character; loop until no repeats remain.
while (re.test(str)) str = str.replace(re, "$1$2");
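
To see why the loop is needed, it may help to record the string after every pass: each replacement removes only one repeated character. This is a rough sketch; `dedupeSteps` is a hypothetical helper name, not part of the answer above.

```javascript
// Hypothetical helper: collects the string after each pass of the loop.
function dedupeSteps(input) {
    var re = /(.)(.*?)\1/;   // a character, a lazy gap, then the same character again
    var steps = [input];
    var str = input;
    while (re.test(str)) {
        // Keep the first character and the gap, drop the repeated character.
        str = str.replace(re, "$1$2");
        steps.push(str);
    }
    return steps;
}

console.log(dedupeSteps("papirana").join(" -> "));
// papirana -> pairana -> pairna -> pairn
```

Note that `.` does not match line breaks, so this sketch assumes single-line input.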

or a variation of the first method:

"papirana".replace(/./g, function(a, n, str) { return str.indexOf(a) == n ? a : "" })

Using a zero-width lookahead assertion you can do something similar:

"papirana".replace(/(.)(?=.*\1)/g, "")

returns

"pirna"

The letters are of course the same, just in a different order, because the lookahead keeps the last occurrence of each character rather than the first.

If you pass in the reversed string and then reverse the result, you get what you're asking for.

This is how you would do it with a loop:

var removeSecondOccurrence = function(string) {
    var results = "";
    for (var i = 0; i < string.length; i++)
        // Append the character only if it has not been seen yet.
        if (results.indexOf(string.charAt(i)) === -1)
            results += string.charAt(i);
    return results;
}

Basically: for each character in the input, if you haven't seen that character already, add it to the results. Clear and readable, at least.
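
The same "have I seen this character?" idea can also be sketched with a `Set` doing the bookkeeping, which avoids rescanning the result string on every step. This assumes an ES2015+ engine, and `removeRepeats` is a hypothetical name used only for illustration.

```javascript
// Hypothetical helper; assumes Set is available (ES2015+).
function removeRepeats(input) {
    var seen = new Set();   // characters already copied to the result
    var result = "";
    for (var i = 0; i < input.length; i++) {
        var ch = input.charAt(i);
        if (!seen.has(ch)) {
            seen.add(ch);
            result += ch;
        }
    }
    return result;
}

console.log(removeRepeats("papirana")); // "pairn"
```

For short strings the difference from the `indexOf` version is negligible; the `Set` only starts to matter on long inputs.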

What Michelle said.

In fact, I strongly suspect it cannot be done with a single regular expression. Or rather, you can if you reverse the string, remove all but the last occurrence of each character, then reverse again, but it's a dirty trick and what Michelle suggests is way better (and probably faster).

If you're still hot on regular expressions...

"papirana".
    split("").
    reverse().
    join("").
    replace(/(.)(?=.*\1)/g, '').
    split("").
    reverse().
    join("")

// => "pairn"

The reason why you can't find all but the first occurrence without all the flippage is twofold:

  • JavaScript does not have lookbehinds, only lookaheads (lookbehinds were added later, in ES2018)
  • Even if it did, I don't think any regexp flavour allows variable-length lookbehinds (a few do, .NET among them)
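
Engines that implement ES2018 do accept lookbehind assertions, including variable-length ones, so on those engines the reversing can probably be skipped. A minimal sketch under that assumption; `removeRepeatsLookbehind` is a hypothetical name, and the pattern will be a syntax error on older engines.

```javascript
// Hypothetical helper; requires an engine with ES2018 lookbehind support.
function removeRepeatsLookbehind(input) {
    // (.) captures one character; the lookbehind then checks that the same
    // character also appears somewhere earlier in the string. Only those
    // later repeats are matched and removed.
    return input.replace(/(.)(?<=\1.+)/g, "");
}

console.log(removeRepeatsLookbehind("papirana")); // "pairn"
```

As with the other patterns here, `.` skips line breaks, so this is a sketch for single-line input.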

Notice: technical posts on this site are licensed under CC BY-SA 4.0. If you republish them, please credit this site's URL or the original address.

 