
Javascript Match and RegExp Issue — Strange Behavior

I have been trying to use a simple jQuery operation to dynamically match and store all anchor tags and their text on the page, but I have found some weird behavior. When you use match() or exec() and pass the needle as a separate RegExp object or a pattern variable, the query matches only one instance among the dozens in the haystack.

And if you designate the pattern like this

match(/needle/gi) 

then it matches every instance of the needle.

Here is my code.

You can even fire up Firebug and try this code right here on this page.

var a = {'text':'','parent':[]}; 

$("a").each(function(i,n) {

    var module = $.trim($(n).text());
    a.text += module.toLowerCase() + ',' + i + ','; 

    a.parent.push($(n).parent().parent()); 

});

var stringLowerCase = 'b';

var regex = new RegExp(stringLowerCase, "gi");
//console.log(a.text);
console.log("regex 1: ", regex.exec(a.text));

var regex2 = "/" + stringLowerCase + "/";
console.log("regex 2: ", a.text.match(regex2));

console.log("regex 3: ", a.text.match(/b/gi));

For me it is returning:

regex 1:  ["b"]
regex 2: null
regex 3: ["b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b", "b"]

Can anyone explain the root of this behavior?

EDIT : I forgot to mention that for regex1, it doesn't make any difference whether you add the flags "gi" for global and case insensitive matching. It still returns only one match.

EDIT2 : Solved my own problem. I still don't know why regex1 matches only one instance, but I managed to match all instances using match() with regex1.

So this matches all instances, and dynamically!

var regex = new RegExp(stringLowerCase, "gi");
console.log("regex 2: ", a.text.match(regex));

This is not unusual behaviour at all. In regex 1 you call exec(), which returns only one match per call, whereas in regex 3 you call match() with a regex carrying the g flag, which tells it to return every match.
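To make the difference concrete, here is a minimal sketch that runs one global, case-insensitive regex through both exec() and match(). The sample string and the variable names are made up for illustration and are not from the question.

```javascript
// Hypothetical sample text standing in for a.text from the question.
var sampleText = "Bob,0,about,1,tab,2";

// One RegExp object, global + case-insensitive, built like "regex 1".
var sampleRegex = new RegExp("b", "gi");

// exec() returns a single match per call, even with the g flag.
var execResult = sampleRegex.exec(sampleText);
console.log("exec: ", execResult[0], "at index", execResult.index);

// match() with a global regex restarts from the beginning of the string
// and returns every match in one array.
var matchResult = sampleText.match(sampleRegex);
console.log("match:", matchResult);
```

The array from exec() holds the one match (plus any capture groups), while the array from match() holds all four occurrences of the letter.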

In regex 2 you are assuming that "/b/" === /b/, but "/b/" !== /b/. "/b/" is a string, so match() searches for the literal text "/b/", slashes included, and only returns a match if your string contains it. /b/ is a regex literal: the slashes merely delimit the pattern b, so "abc".match(/b/) returns "b".
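A short sketch of that distinction, using made-up test strings: match() turns a string argument into a regex via new RegExp(string), so the slashes end up inside the pattern.

```javascript
// A string passed to match() is converted with new RegExp(string),
// so the slashes become literal characters in the pattern.
var slashPattern = "/b/";

var noSlashes = "abc".match(slashPattern);      // no "/b/" substring here
var withSlashes = "a/b/c".match(slashPattern);  // contains the text "/b/"

console.log(noSlashes);        // null
console.log(withSlashes[0]);   // "/b/"

// A regex literal uses the slashes only as delimiters.
var literalResult = "abc".match(/b/);
console.log(literalResult[0]); // "b"

// The RegExp built from the string behaves the same way as the string did.
var converted = new RegExp(slashPattern);
console.log(converted.test("a/b/c")); // true
console.log(converted.test("abc"));   // false
```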

I hope that helps.

EDIT:

Looking into it a little more, the exec() method returns only the first match it finds rather than all of them.

EDIT:

var myRe = /ab*/g;
var str = "abbcdefabh";
var myArray;
while ((myArray = myRe.exec(str)) != null)
{
  var msg = "Found " + myArray[0] + ".  ";
  msg += "Next match starts at " + myRe.lastIndex;
  console.log(msg);
}

Having looked at it again, exec() definitely returns only the first match it finds on each call. If you call it in a loop with a global regex, as above, it returns the next match each time.
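As a rough sketch of that looping idea, the helper below gathers every match by calling exec() until it returns null. The name collectMatches is hypothetical, and the helper assumes it is usually given a regex with the g flag.

```javascript
// Hypothetical helper: gather every match by calling exec() repeatedly.
function collectMatches(regex, text) {
  // Without the g flag exec() always returns the first match,
  // so looping would never end; handle that case separately.
  if (!regex.global) {
    var single = regex.exec(text);
    return single ? [single[0]] : [];
  }
  var results = [];
  var m;
  regex.lastIndex = 0; // start from the beginning of the text
  while ((m = regex.exec(text)) !== null) {
    results.push(m[0]);
    // Step past zero-length matches, which would otherwise repeat forever.
    if (m[0] === "") {
      regex.lastIndex++;
    }
  }
  return results;
}

var re = /ab*/g;
var input = "abbcdefabh";

var viaExec = collectMatches(re, input);
console.log(viaExec);          // ["abb", "ab"]
console.log(re.lastIndex);     // 0, reset once exec() returned null
console.log(input.match(re));  // ["abb", "ab"]
```

Each successful exec() call on a global regex advances lastIndex, which is what lets the next call pick up where the previous one stopped.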

Why does it do this? I have no idea... my JavaScript kung fu clearly isn't strong enough to answer that part.

The reason regex 2 returns null is that you are passing "/b/" as the pattern, while "b" is the only part that actually belongs to the pattern. The slashes are literal syntax for a regex, just as [ ] is for an array. If you replaced that with just new RegExp("b"), you would get a match, but only one, since that omits the global and ignore-case flags. To get the same results for #2 and #3, modify accordingly:

// match() takes a single argument, so the flags must go into the RegExp itself
var regex2 = new RegExp(stringLowerCase, "gi");
console.log("regex 2: ", a.text.match(regex2));
console.log("regex 3: ", a.text.match(/b/gi));
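When the needle comes from a variable, characters such as . or ( would be read as regex syntax. The sketch below shows one way to handle that, assuming the needle should be matched as plain text; escapeForRegExp and findAll are hypothetical helper names, not part of jQuery or the question's code.

```javascript
// Hypothetical helper: escape characters that are special in a regex,
// so the needle is matched as plain text.
function escapeForRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: find all case-insensitive occurrences of a needle.
function findAll(haystack, needle) {
  var pattern = new RegExp(escapeForRegExp(needle), "gi");
  return haystack.match(pattern) || []; // match() returns null when nothing matches
}

console.log(findAll("Bob,0,about,1", "b"));  // ["B", "b", "b"]
console.log(findAll("a.b,a-b,a.B", "a.b"));  // ["a.b", "a.B"]
console.log(findAll("abc", "z"));            // []
```

Returning an empty array instead of null is a design choice here, so callers can read .length without a null check.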

regex2 is a string, not a RegExp. I had trouble with this kind of syntax too, though I'm not really sure of the behavior.

Edit : Remembered : for regex2, JS looks for "/b/" as the needle, not "b".
