
JavaScript split text and regex

I am working with Firefox under Debian, and I don't understand this JavaScript behaviour:

var testRegex = /yolo .+ .+/gu;
let test = `yolo 2 abc
yolo 2 abc`;

test = test.split('\n');

for (let t=0; t < test.length; t++)
{
    console.log(test[t], testRegex.exec(test[t]));
}

It prints:

[Image: console output]

Something even stranger:

for (let t=0; t < test.length; t++)
{
    console.log(test[t], testRegex.exec(test[t]), test[t].match(testRegex));
}

This prints:

[Image: console output]

I don't think it is an encoding problem or a bug in my code.

What can I do?

This is actually expected behaviour, believe it or not. When a regex has the g (or y) flag, its exec() method is stateful and is intended to be called in a loop. Each call returns the next match within the string, starting from the regex's lastIndex property, until no further matches are found, at which point it returns null.

To highlight this in your first example, let's quickly simplify the code a bit and show what values are in each variable.

let testRegex = /yolo .+ .+/gu;
let test = [
  "yolo 2 abc",
  "yolo 2 abc"
]

This results in your calls to testRegex.exec looking something like the following:

testRegex.exec("yolo 2 abc") // => Array ["yolo 2 abc"]
testRegex.exec("yolo 2 abc") // => null

The documentation for exec() describes this behaviour:

If your regular expression uses the "g" flag, you can use the exec() method multiple times to find successive matches in the same string. When you do so, the search starts at the substring of str specified by the regular expression's lastIndex property (test() will also advance the lastIndex property). Note that the lastIndex property will not be reset when searching a different string, it will start its search at its existing lastIndex.
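To make the quoted rule concrete, here is a minimal sketch that records lastIndex before and after each exec() call on the two lines from the question. The names traceRegex, traceLines and trace are made up for this illustration.

```javascript
// Illustrative names only: traceRegex, traceLines and trace are not from the question.
const traceRegex = /yolo .+ .+/gu;
const traceLines = ["yolo 2 abc", "yolo 2 abc"];
const trace = [];

for (const line of traceLines) {
  const before = traceRegex.lastIndex;   // index where this search will start
  const result = traceRegex.exec(line);  // match array, or null when nothing is found
  trace.push({
    before,
    matched: result !== null,
    after: traceRegex.lastIndex,         // end of the match, or 0 after a failed search
  });
}

console.log(trace);
```

The first call starts at 0 and leaves lastIndex at 10, the end of the match. The second call starts at 10, finds nothing, and resets lastIndex to 0, which is why a third line would match again.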

The second example does not run into this issue because match(), when called with a global regex, resets the lastIndex property to 0 internally and leaves it at 0 when it finishes. The exec() call in the next loop iteration therefore searches from the start of the string.
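The effect of match() on lastIndex can be checked directly. This is a small sketch using one line from the question; the variable names (sharedRegex, sampleLine, afterExec, afterMatch) are made up for the illustration.

```javascript
// Illustrative names only; the regex and the input line come from the question.
const sharedRegex = /yolo .+ .+/gu;
const sampleLine = "yolo 2 abc";

sharedRegex.exec(sampleLine);                       // moves lastIndex to the end of the match
const afterExec = sharedRegex.lastIndex;

const allMatches = sampleLine.match(sharedRegex);   // with "g": starts at 0 and collects every match
const afterMatch = sharedRegex.lastIndex;           // back at 0 once match() has finished

console.log(afterExec, allMatches, afterMatch);
```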

Coming back to your original example, you could modify it as follows and you would see the behaviour you're expecting:

var testRegex = /yolo .+ .+/gu;
let test = `yolo 2 abc
yolo 2 abc`;

test = test.split('\n');

for (let t=0; t < test.length; t++)
{
    console.log(test[t], testRegex.exec(test[t]));
    testRegex.lastIndex = 0;
}
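As an alternative to resetting lastIndex by hand, and assuming only one match per line is needed (the question does not say otherwise), the g flag can simply be dropped. Without g or y, exec() ignores lastIndex and always searches from the start of the string. The names lineRegex, text and results below are made up for this sketch.

```javascript
// Assumption: one match per line is enough, so the "g" flag is not needed.
const lineRegex = /yolo .+ .+/u;   // no "g": exec() always starts at index 0
const text = `yolo 2 abc
yolo 2 abc`;

// Run the same regex against every line; no state carries over between calls.
const results = text.split('\n').map((line) => lineRegex.exec(line));

for (const result of results) {
  console.log(result);             // a match array for each line, never null here
}
```

If several matches per line are required, keeping the g flag and resetting lastIndex as shown above remains the approach from the answer.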


 