RegEx needed to split javascript string on “|” but not “\|”
We would like to split a string on instances of the pipe character |, but not if that character is preceded by an escape character, e.g. \|.
For example, we would like the following string to be split into the following components:
1|2|3\|4|5
1
2
3\|4
5
I'm expecting to be able to use the JavaScript function split, which takes a regular expression. What regex would I pass to split? We are cross-platform and would like to support the current and previous versions (one version back) of IE, FF, and Chrome if possible.
Instead of a split, do a global match (the same way a lexical analyzer would): match anything other than \ or |, or match any escaped character. Something like this:
var str = "1|2|3\\|4|5";
var matches = str.match(/([^\\|]|\\.)+/g);
A quick explanation: ([^\\|]|\\.) matches either any character except '\' and '|' (pattern: [^\\|]) or any escaped character (pattern: \\.). The + after it tells the engine to match the preceding group one or more times, so the pattern ([^\\|]|\\.) is matched one or more times. The g at the end of the regex literal tells the JavaScript regex engine to match the pattern globally instead of matching it just once.
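As a minimal sketch of how the global match above might be wrapped for reuse (the helper name splitOnUnescapedPipes is hypothetical, not from the answer), note two behaviours that differ from split: match returns null when nothing matches, and empty fields between adjacent pipes are dropped.

```javascript
// Hypothetical helper wrapping the global-match approach from the answer.
function splitOnUnescapedPipes(str) {
    // match() returns null when there is no match, so fall back to [].
    return str.match(/([^\\|]|\\.)+/g) || [];
}

var parts = splitOnUnescapedPipes('1|2|3\\|4|5');
console.log(parts); // [ '1', '2', '3\\|4', '5' ]

// Unlike split(), empty fields are not preserved:
var sparse = splitOnUnescapedPipes('a||b');
console.log(sparse); // [ 'a', 'b' ]
```

If empty fields matter for your data, this approach is probably not the right fit as written.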
What you're looking for is a "negative look-behind matching regular expression".
This isn't pretty, but it should split the list for you:
// Keep escaped pipes (\|) as they are; turn every unescaped | into a newline.
var output = input.replace(/(\\)?\|/g, function($0, $1){ return $1 ? $0 : '\n'; });
This takes your input string and replaces every '|' character NOT immediately preceded by a '\' character with a '\n' character. The result can then be split on '\n', assuming the input itself contains no newlines.
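For comparison, engines that implement ES2018 look-behind assertions can express the negative look-behind directly in split. This is only a sketch for modern environments: older browsers, including every version of IE, reject the pattern as a syntax error, so it does not meet the compatibility requirement in the question. The function name splitWithLookbehind is hypothetical.

```javascript
// Sketch only: requires a JavaScript engine with ES2018 look-behind support.
function splitWithLookbehind(str) {
    // (?<!\\) asserts that the pipe is not immediately preceded by a backslash.
    return str.split(/(?<!\\)\|/);
}

console.log(splitWithLookbehind('1|2|3\\|4|5')); // [ '1', '2', '3\\|4', '5' ]
console.log(splitWithLookbehind('a||b'));        // [ 'a', '', 'b' ]
```

Because this uses split, empty fields between adjacent pipes are kept.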
A regex solution was posted while I was looking into this, so I just went ahead and wrote one without regex. I did some simple benchmarks and it is slightly faster (I expected it to be slower...).
Without using regex, and if I understood what you want, this should do the job:
function doSplit(input) {
    var output = [];
    var currPos = -1,  // start at -1 so a pipe at index 0 is also found
        prevPos = -1;  // index of the last unescaped pipe seen
    while ((currPos = input.indexOf('|', currPos + 1)) != -1) {
        // Skip pipes that are escaped by a preceding backslash.
        if (input[currPos - 1] == "\\") continue;
        output.push(input.substring(prevPos + 1, currPos));
        prevPos = currPos;
    }
    // Whatever follows the last unescaped pipe is the final piece.
    output.push(input.substring(prevPos + 1));
    return output;
}
doSplit('1|2|3\\|4|5'); // returns [ '1', '2', '3\\|4', '5' ]
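The benchmark mentioned above is not shown in the answer. A rough way to compare the two approaches yourself might look like the sketch below; the function names, the sample string, and the iteration count are all assumptions for illustration, and timings will vary by engine and input.

```javascript
// Hypothetical micro-benchmark; names and iteration count are illustrative.
function splitWithRegex(input) {
    return input.match(/([^\\|]|\\.)+/g) || [];
}

function splitWithIndexOf(input) {
    var output = [], currPos = -1, prevPos = -1;
    while ((currPos = input.indexOf('|', currPos + 1)) !== -1) {
        if (input[currPos - 1] === '\\') continue; // escaped pipe, keep scanning
        output.push(input.substring(prevPos + 1, currPos));
        prevPos = currPos;
    }
    output.push(input.substring(prevPos + 1));
    return output;
}

// Runs fn(input) `iterations` times and returns the elapsed milliseconds.
function timeIt(fn, input, iterations) {
    var start = Date.now();
    for (var i = 0; i < iterations; i++) {
        fn(input);
    }
    return Date.now() - start;
}

var sample = '1|2|3\\|4|5';
var regexMs = timeIt(splitWithRegex, sample, 100000);
var indexOfMs = timeIt(splitWithIndexOf, sample, 100000);
console.log('regex: ' + regexMs + ' ms, indexOf: ' + indexOfMs + ' ms');
```

A loop like this only gives a coarse impression; for short inputs the difference may well be within measurement noise.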