Different results of RegEx using JS split function - empty strings in result

I'm trying to split a string using a regular expression and the split function in JavaScript. For example, I have a string: olej sojowy, sorbitol, czerwień koszenilową and my RegEx is:

/, (?!(któ))/g

When I test it here: http://regexr.com/38ps8 I get 2 matches, as expected, so I should get 3 elements after the split.

But when I try to use this expression in the split function:

var parts="olej sojowy, sorbitol, czerwień koszenilową".split(/, (?!(któ))/g);
console.log("Num of elements:" + parts.length); 
console.log(parts.join("!\n!"));

the result is different: it returns an array of 5 elements, two of which are additional empty strings:

Num of elements:5 
olej sojowy!
!!
!sorbitol!
!!
!czerwień koszenilową 

Why isn't it working as expected? Is it a problem with the split function? Does it use a regular expression in a different way than I would expect?

Edit: I've also just noticed that if I change my regular expression to /, /g then I get just what I wanted (3 elements in the result), but there are other strings which I don't want to split if there is któ after the comma and space. So why is this operator changing the behaviour of split?

It's working exactly as it should. You've used a regular expression with capturing parentheses as the delimiter, so it gives you five elements:

[1] olej sojowy
[2]
[3] sorbitol
[4]
[5] czerwień koszenilową

The empty elements sit where the splits are located: each one is the result of the capturing parentheses for that match, which is undefined here and is printed as an empty string by join.

From Mozilla's JS ref:

If separator is a regular expression that contains capturing parentheses, then each time separator is matched, the results (including any undefined results) of the capturing parentheses are spliced into the output array. However, not all browsers support this capability.
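The behaviour described in the quote can be checked directly. This is a minimal sketch using the string and regex from the question; the variable names input and parts are only illustrative.

```javascript
// The separator from the question: ", " not followed by "któ",
// with a capturing group inside the negative lookahead.
const input = "olej sojowy, sorbitol, czerwień koszenilową";
const parts = input.split(/, (?!(któ))/g);

console.log(parts.length);           // 5
console.log(parts[1] === undefined); // true: the group captured nothing
console.log(parts[3] === undefined); // true

// join() renders undefined entries as empty strings,
// which is why the question's output shows blank elements.
console.log(parts.join("|"));        // olej sojowy||sorbitol||czerwień koszenilową
```

The two extra entries are therefore undefined rather than genuine empty strings; they only look empty once they pass through join.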

If the regex in split contains capturing groups, the contents of each group are inserted in the result as well. Since you have a capturing group (któ), that is what you get. It is empty because the group sits inside the negative lookahead (?!(któ)): the separator only matches when któ does not follow, so the group never captures anything. If you add the text któ after one of the commas in your string, that comma is no longer a separator, but the remaining match still contributes an empty element:

var parts="olej sojowy, któ sorbitol, czerwień koszenilową".split(/, (?!(któ))/g);

shows 3 elements. The 2nd is, perhaps surprisingly, empty (undefined): it is the capturing group's result for the one separator that still matches, the comma before czerwień. The comma where któ follows is not matched at all, so it stays inside the first element.

If you omit the parentheses inside your lookahead, it works as you expect it to:

var parts="olej sojowy, któ sorbitol, czerwień koszenilową".split(/, (?!któ)/g);

There are no capturing groups, so you get only the text that remains after the matches of the regex are removed.
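As a rough side-by-side check, the sketch below runs both versions of the regex on the string containing któ. The names text, withGroup, withoutGroup and nonCapturing are illustrative, and the alternation któ|kto in the last call is a hypothetical pattern, included only to show the non-capturing form (?:...) for cases where grouping is still needed.

```javascript
const text = "olej sojowy, któ sorbitol, czerwień koszenilową";

// Capturing group inside the lookahead: the single match contributes
// an extra undefined entry to the result.
const withGroup = text.split(/, (?!(któ))/g);
console.log(withGroup.length); // 3
console.log(withGroup[1]);     // undefined

// No capturing group: only the text between matches is returned.
const withoutGroup = text.split(/, (?!któ)/g);
console.log(withoutGroup);
// [ 'olej sojowy, któ sorbitol', 'czerwień koszenilową' ]

// Hypothetical alternation: (?:...) groups without capturing,
// so nothing extra is spliced into the result.
const nonCapturing = text.split(/, (?!(?:któ|kto))/g);
console.log(nonCapturing.length); // 2
```

In both the parenthesis-free and the non-capturing form, the comma before któ is left alone and the result contains no placeholder entries.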
