RegEx needed to split JavaScript string on "|" but not "\|"

We would like to split a string on instances of the pipe character |, but not if that character is preceded by an escape character, e.g. \|.

For example, we would like to see the following string split into the following components:

1|2|3\|4|5

1
2
3\|4
5

I'm expecting to be able to use the JavaScript split function, which takes a regular expression. What regex would I pass to split? We are cross-platform and would like to support the current and previous versions (one version back) of IE, FF, and Chrome if possible.

Instead of a split, do a global match (the same way a lexical analyzer would):

  • match anything other than \ or |
  • or match any escaped char

Something like this:

var str = "1|2|3\\|4|5";
var matches = str.match(/([^\\|]|\\.)+/g);

A quick explanation: ([^\\|]|\\.) matches either any character except '\' and '|' (pattern: [^\\|]) or (pattern: |) any escaped character (pattern: \\.). The + after it tells the engine to match the preceding group one or more times, so ([^\\|]|\\.) will therefore be matched one or more times. The g at the end of the regex literal tells the JavaScript regex engine to match the pattern globally instead of matching it just once.
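As a quick check of the pattern explained above, here is a minimal sketch that wraps it in a helper. The name splitByMatch is hypothetical and not part of the answer. One caveat worth knowing: because the pattern needs at least one character per match, empty fields between two adjacent pipes are dropped.

```javascript
// Hypothetical helper name; wraps the global-match approach described above.
function splitByMatch(str) {
    // [^\\|] : any single character other than a backslash or a pipe
    // \\.    : a backslash followed by any character (an escaped character)
    // match() returns null when nothing matches, so fall back to [].
    return str.match(/([^\\|]|\\.)+/g) || [];
}

console.log(splitByMatch("1|2|3\\|4|5")); // [ '1', '2', '3\\|4', '5' ]
console.log(splitByMatch("a||b"));        // [ 'a', 'b' ] (the empty field is lost)
```

If empty fields matter for your data, a split-based approach is probably a better fit than this one.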

What you're looking for is a "negative look-behind matching regular expression".

This isn't pretty, but it should split the list for you:

var output = input.replace(/(\\)?\|/g, function($0,$1){ return $1 ? $0 : '\n'; });

This will take your input string and replace all of the '|' characters NOT immediately preceded by a '\' character with '\n' characters.
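For reference, here is a minimal sketch of both ideas mentioned in this answer. The function names are hypothetical. splitViaReplace assumes the input contains no newline characters of its own, since '\n' serves as the temporary separator. splitViaLookbehind assumes an engine with lookbehind support (ES2018 or later), which rules out Internet Explorer.

```javascript
// Replace-then-split: works without lookbehind support.
function splitViaReplace(input) {
    // (\\)? optionally captures a backslash right before the pipe.
    // If it was captured, the pipe is escaped: keep the whole match ($0).
    // Otherwise turn the bare pipe into the temporary '\n' separator.
    var marked = input.replace(/(\\)?\|/g, function ($0, $1) {
        return $1 ? $0 : '\n';
    });
    return marked.split('\n');
}

// Direct split with a negative lookbehind (ES2018+ engines only).
function splitViaLookbehind(input) {
    // (?<!\\) asserts the pipe is not immediately preceded by a backslash.
    return input.split(/(?<!\\)\|/);
}

console.log(splitViaReplace('1|2|3\\|4|5'));    // [ '1', '2', '3\\|4', '5' ]
console.log(splitViaLookbehind('1|2|3\\|4|5')); // [ '1', '2', '3\\|4', '5' ]
```

Both versions keep empty fields, so 'a||b' yields three items with an empty string in the middle.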

A regex solution was posted as I was looking into this, so I just went ahead and wrote one without it. I did some simple benchmarks and it is slightly faster (I expected it to be slower...).

Without using regex, if I understood what you want, this should do the job:

function doSplit(input) {
    var output = [];
    // Start at -1 so the first search begins at index 0 and a leading '|' is not missed.
    var currPos = -1,
        prevPos = -1;
    while ((currPos = input.indexOf('|', currPos + 1)) !== -1) {
        // Skip a pipe that is escaped by a preceding backslash.
        if (input[currPos - 1] === "\\") continue;
        // Collect the text between the previous unescaped pipe and this one.
        output.push(input.slice(prevPos + 1, currPos));
        prevPos = currPos;
    }
    // Collect whatever follows the last unescaped pipe.
    output.push(input.slice(prevPos + 1));
    return output;
}
doSplit('1|2|3\\|4|5'); // returns [ '1', '2', '3\\|4', '5' ]
