JavaScript regex capturing parentheses

I don't really get the concept of capturing parentheses when dealing with JavaScript regex. I don't understand why we need the parentheses in the following example:

var x = "{xxx} blah blah blah {yyy} and {111}";
x.replace( /{([^{}]*)}/g , 
          function(match,content) {
               console.log(match,content);
               return "whatever";
});

//it will print
{xxx} xxx
{yyy} yyy
{111} 111

When I drop the parentheses from the pattern, the results are different:

x.replace( /{[^{}]*}/g , 
          function(match,content) {
               console.log(match,content);
               return "whatever";
});

//it will print
{xxx} 0
{yyy} 21
{111} 31

So the content values now become numbers, and I have no idea why. Can someone explain what's going on behind the scenes?

According to the MDN documentation, the parameters to the function will be, in order:

  • The matched substring.
  • Any groups that are defined, if there are any.
  • The index in the original string where the match was found.
  • The original string.

So in the first example, content will be the string which was captured in group 1. But when you remove the group in the second example, content is actually the index where the match was found.
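
As a minimal sketch of that parameter order (the helper name collectArgs and the short sample string are made up for illustration), a rest parameter can record everything the callback receives. With a group, the second argument is the captured text; without one, the offset moves up into that position.

```javascript
// Hypothetical helper: records every argument list the replace callback receives.
function collectArgs(input, pattern) {
  var calls = [];
  input.replace(pattern, function (...args) {
    calls.push(args);
    return args[0]; // return the match itself, so nothing is actually replaced
  });
  return calls;
}

var sample = "a {b} c {d}";

// One capturing group: [match, group 1, offset, original string]
var withGroup = collectArgs(sample, /{([^{}]*)}/g);
console.log(withGroup[0].slice(0, 3)); // [ '{b}', 'b', 2 ]

// No capturing group: [match, offset, original string]
var withoutGroup = collectArgs(sample, /{[^{}]*}/g);
console.log(withoutGroup[0].slice(0, 2)); // [ '{b}', 2 ]
```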

Capturing groups are useful when replacing text.

For example, I have this string "one two three four" that I want to reverse like "four three two one". To achieve that I will use this line of code:

var reversed = "one two three four".replace(/(one) (two) (three) (four)/, "$4 $3 $2 $1");

Note how each $n in the replacement string stands for the text captured by the n-th pair of parentheses.

Another example: I have the same string "one two three four" and I want to print each word twice:

var eachWordTwice = "one two three four".replace(/(one) (two) (three) (four)/, "$1 $1 $2 $2 $3 $3 $4 $4");
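
The same idea works when the words are not known in advance. The sketch below is an illustration, not part of the original answer: it assumes exactly four space-separated words and uses (\w+) groups in place of the literal words. The variable names are made up.

```javascript
// Four capturing groups, one per word; $1..$4 refer to them in order.
var fourWords = /(\w+) (\w+) (\w+) (\w+)/;

var reversedWords = "one two three four".replace(fourWords, "$4 $3 $2 $1");
console.log(reversedWords); // "four three two one"

// A group can be referenced more than once in the replacement string.
var doubledWords = "red green blue black".replace(fourWords, "$1 $1 $2 $2 $3 $3 $4 $4");
console.log(doubledWords); // "red red green green blue blue black black"
```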

The numbers are the offset of each match. MDN describes that argument as:

The offset of the matched substring within the total string being examined. (For example, if the total string was "abcd", and the matched substring was "bc", then this argument will be 1.)

Source:

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/replace

"Specifying a function as a parameter" section

Parentheses are used to capture, and then reuse, only a portion of the match. For instance, I use them to match phone numbers that may or may not have extensions. The function below matches the whole string (if one of the if conditions is true), so the entire string is replaced, but the output uses only specific groups of characters in a specific order, while whitespace and other characters ("() -x") are allowed in the input.

It will always output a string formatted as (651) 258-9631 x1234 if given 6512589631x1234 or 1 651 258 9631 1234. It also doesn't allow (or in this case format) toll-free numbers, as they aren't allowed in my field.

function phoneNumber(v) {
  // Take in a string, return a formatted string such as "(651) 651-6511 x1234".
  // Groups: $1 = area code, $2 = prefix, $3 = line number, $4 = extension.
  var withExt = /^[1]{0,1}[-(\s.]{0,1}(?!800|888|877|866|855|900)([2-9][0-9]{2})[-)\s.]{0,2}([2-9][0-9]{2})[-.\s]{0,2}([0-9]{4})[\s]*[x]{0,1}([0-9]{1,5}){1}$/i;
  var noExt = /^[1]{0,1}[-(\s.]{0,1}(?!800|888|877|866|855|900)([2-9][0-9]{2})[-)\s.]{0,1}([2-9][0-9]{2})[-.\s]{0,2}([0-9]{4})$/i;
  if (withExt.test(v)) { return v.replace(withExt, "($1) $2-$3 x$4"); }
  if (noExt.test(v)) { return v.replace(noExt, "($1) $2-$3"); }
  return v; // no match: return the input unchanged
}

What this allows me to do is gather the area code, prefix, line number, and an optional extension, and format it the way I need it (for users who can't follow directions, for instance).

So if you input 6516516511x1234 or "(651) 651-6511 x1234", it will match one regex or the other in this example.
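
To make the role of each group easier to see, here is a minimal sketch with a deliberately simplified pattern. It is not the regex from the function above (it skips the leading 1 and the toll-free check), and the names simplePhone and phoneParts are hypothetical. Each pair of capturing parentheses becomes one numbered entry in the result of exec, while the (?:...) wrapper groups without capturing.

```javascript
// Simplified, hypothetical pattern (NOT the regex used in phoneNumber):
// groups 1-3 capture area code, prefix and line number; group 4 captures
// an optional extension. (?:...) groups without capturing.
var simplePhone = /^\(?(\d{3})\)?[-.\s]?(\d{3})[-.\s]?(\d{4})(?:\s*x(\d{1,5}))?$/i;

function phoneParts(v) {
  var m = simplePhone.exec(v);
  if (m === null) { return null; }
  // m[0] is the whole match; m[1]..m[4] are the capturing groups.
  // m[4] is undefined when no extension is present.
  return { area: m[1], prefix: m[2], line: m[3], ext: m[4] };
}

console.log(phoneParts("6516516511x1234"));
// { area: '651', prefix: '651', line: '6511', ext: '1234' }
console.log(phoneParts("(651) 651-6511").ext); // undefined
console.log(phoneParts("not a number")); // null
```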

Now, what is happening in your code is what @amine-hajyoussef said: the index of the start of each match is being passed as the second argument. Your code would be better served by match for example one (text returned), or by search for the index, as in example two. pswg's answer goes into more detail.
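
A rough sketch of those alternatives, using a short made-up sample string: match with the g flag returns the matched text, search returns the index of the first match only, and a loop over exec gives both the captured content and the index of every match.

```javascript
var sample = "a {b} c {d}";

// match with the g flag returns every full match (captured groups are not included).
var allMatches = sample.match(/{[^{}]*}/g);
console.log(allMatches); // [ '{b}', '{d}' ]

// search returns the index of the first match only.
var firstIndex = sample.search(/{[^{}]*}/);
console.log(firstIndex); // 2

// To get each captured text together with its index, loop with exec.
var re = /{([^{}]*)}/g;
var found = [];
var m;
while ((m = re.exec(sample)) !== null) {
  found.push({ content: m[1], index: m.index });
}
console.log(found); // [ { content: 'b', index: 2 }, { content: 'd', index: 8 } ]
```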
