
In a regular expression, if the first parenthesized group doesn't match, can $1 be an empty string when using replace() in JavaScript vs. gsub in Ruby?

Using JavaScript, to extract the prefix foo. (including the .) from foo.bar, I could use:

> "foo.bar".replace(/(\w+.)(.*)/, "$1")
"foo."

But if there is no such prefix, I'd expect it to give an empty string or null, but instead it gives the full string:

> "foobar".replace(/(\w+.)(.*)/, "$1")
"foobar"

Why does $1 give the whole string? I thought it referred to the first parenthesized group.

  1. Maybe it means the first parenthesis that actually matched?
  2. If #1 is true, then maybe a common, standard technique is to use ? , which works in Ruby:

    using irb:

     > "foo.bar".gsub(/(\w+\.)?(.*)/, '\1')
     "foo."
     > "foobar".gsub(/(\w+\.)?(.*)/, '\1')
     ""

    Because the ? makes the first group optional, the pattern will match anyway. However, it doesn't work in JavaScript:

     > "foobar".replace(/(\w+.)?(.*)/, "$1")
     "foobar"

    I can use match() in JavaScript to do it, and it will be quite clean, but just for the sake of understanding replace() more:

  3. What is the reason that it works differently in Ruby vs JavaScript, and do #1 and #2 above also apply and/or what is a good alternative way to "grab" the prefix or get "" if it doesn't exist using replace() ?

FYI, I think your JavaScript's regex isn't correct since it doesn't escape the . (dot) character.

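$1 does refer to the first group. You get the whole string back because the unescaped . in your regex matches any character: \w+ takes fooba, the . takes r, and group 1 ends up capturing all of foobar. With the dot escaped, the pattern doesn't match foobar at all: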

/* your js regex is /(\w+.)/, I use /(\w+\.)/ instead to demonstrate it */
"foobar".replace(/(\w+\.)/, "$1"); // 'foobar'

The pattern finds no match, so replace() has nothing to substitute and simply returns the original string unchanged; $1 is never filled in. To make it clearer, take a look at the following examples.

"foobar".replace(/(\w+\.)/, '-');    // 'foobar' (no match, so nothing gets replaced)
"foobar".replace(/(\w+\.)/, '$1');   // 'foobar' (no match, $1 is never used, nothing gets replaced)
"foobar.a".replace(/(\w+\.)/, '-');  // '-a' (matches 'foobar.', so 'foobar.' becomes '-', followed by 'a')
"foobar.a".replace(/(\w+\.)/, '$1'); // 'foobar.a' (matches 'foobar.', so 'foobar.' is replaced with itself, followed by 'a')
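
If it helps, here is a small sketch that uses match() instead of replace() to look at what group 1 actually holds for each pattern. The variable names are only for illustration.

```javascript
// Inspect the captures directly instead of going through replace().
// Unescaped dot: "." matches any character, so it can consume the final "r".
const unescaped = "foobar".match(/(\w+.)(.*)/);
// Escaped dot: a literal "." is required, and "foobar" has none.
const escaped = "foobar".match(/(\w+\.)(.*)/);

console.log(unescaped[1]); // "foobar" (group 1 captured the whole string)
console.log(unescaped[2]); // ""       (nothing left for group 2)
console.log(escaped);      // null     (no match at all)
```

So the two patterns return "foobar" from replace() for different reasons: the first matches everything into $1, the second matches nothing.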

The replace method in JavaScript gives you a copy of the original string whether you successfully altered it or not.

So for instance:

alert( "atari.teenageRiot".replace(/5/,'reverse polarity of the neutron flow') );
//"atari.teenageRiot"

replace() isn't about finding matches. It replaces whatever the first argument matches with the second argument, and it always returns the resulting string, whether or not anything was actually changed.
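
Since the question asks for the prefix or "" when there is none, match() is probably the more natural fit, because it reports "no match" as null. A minimal sketch, where prefixViaMatch is a made-up helper name and not a built-in:

```javascript
// Hypothetical helper: returns the leading "word." prefix, or "" if absent.
function prefixViaMatch(s) {
  // match() returns null when the pattern is not found,
  // so fall back to an empty string in that case.
  const m = s.match(/^\w+\./);
  return m ? m[0] : "";
}

console.log(prefixViaMatch("foo.bar")); // "foo."
console.log(prefixViaMatch("foobar"));  // ""
```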

Also, I would use this instead:

"foo.bar".replace(/(\w+\.)(.*)/, "$1")

You didn't have the \ before the . so it was treated as the wildcard, which matches any character except a line terminator.
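
As far as I can tell, the escaped dot is the only real difference between the Ruby and JavaScript attempts in the question. A quick sketch of the optional-group pattern with and without the escape (variable names are just for illustration):

```javascript
// With the dot escaped, group 1 cannot match "foobar", the "?" lets it be
// skipped, (.*) consumes the whole string, and the unmatched $1 becomes "".
const withEscapedDot = "foobar".replace(/(\w+\.)?(.*)/, "$1");
// With the dot unescaped, group 1 matches "fooba" + "r", so $1 is "foobar".
const withBareDot = "foobar".replace(/(\w+.)?(.*)/, "$1");
// A string that does have the prefix still yields the prefix.
const withPrefix = "foo.bar".replace(/(\w+\.)?(.*)/, "$1");

console.log(withEscapedDot); // ""
console.log(withBareDot);    // "foobar"
console.log(withPrefix);     // "foo."
```

So in JavaScript, $1 is replaced with an empty string when its group did not participate in the match, which is the same result the Ruby gsub example gives.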
