
Regex & JS: how to match a string between double quotes that starts with a given prefix and does not end with certain patterns

I have been struggling to write this regex and would love some help.

I want to match a URL string if it

  • is between double quotes ("")
  • starts with "https://example.com
  • contains no space, tab, or newline between the quotes
  • does not end with a pattern like .dont_match1 or .dont_match1/

and then replace example.com with example2.com.

For example,

bla ...... "https://example.com/content/a.dont_match1" 
bla ...... "https://example.com/content/a.dont_match2" 

No match

href="https://example.com/"    

Matched and replaced with => href="https://example2.com/"

<link rel="canonical" href="https://example.com adasd /" />

No match because of the stupid space

<link rel="manifest" href="https://example.com/a/asd/aaaa">

Matched and replaced with => <link rel="manifest" href="https://example2.com/a/asd/aaaa">

All these lines are in a file.

I have been stuck on this for a while and have tried quite a few patterns, but none of them work well:

  • (=".*)(example.com)([^\s])*"
  • (=".*)(example.com)([^\s|^.dont_match1 |^.dont_match2])*"

You can use

/("https:\/\/)example\.com(?![^\s"]*\.(?:dont_match1|dont_match2)\/?")([^\s"]*")/g

Replace with $1example2.com$2. See the regex demo.

Details

  • ("https:\/\/) - Group 1 ( $1 ): the "https:// string
  • example\.com - the example.com string
  • (?![^\s"]*\.(?:dont_match1|dont_match2)\/?") - a negative lookahead that fails the match if, immediately to the right of the current location, there are zero or more chars other than whitespace and " followed by a . , then either dont_match1 or dont_match2 , then an optional / and then a "
  • ([^\s"]*") - Group 2 ( $2 ): zero or more chars other than whitespace and " and then a " char.
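To illustrate the lookahead described above: because it ends in \/?" it only rejects URLs where the excluded pattern sits right before the closing quote (with or without a trailing slash). The two sample strings and the swap helper below are made up for this sketch; they are not from the question.

```javascript
// Same regex as in the answer; `swap` is a hypothetical helper name.
const rx = /("https:\/\/)example\.com(?![^\s"]*\.(?:dont_match1|dont_match2)\/?")([^\s"]*")/g;
const swap = (s) => s.replace(rx, '$1example2.com$2');

// Excluded pattern at the very end, followed by an optional slash: left unchanged.
const atEnd = swap('"https://example.com/a.dont_match1/"');

// Excluded pattern in the middle of the path: the lookahead does not apply,
// so the host is still rewritten.
const inMiddle = swap('"https://example.com/a.dont_match1/b"');

console.log(atEnd);
console.log(inMiddle);
```

If a URL that merely contains .dont_match1 anywhere should also be skipped, the lookahead would need to be changed; the pattern as written only checks the end of the quoted string.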

JavaScript demo:

 const array = [
   'bla ...... "https://example.com/content/a.dont_match1"',
   'bla ...... "https://example.com/content/a.dont_match2"',
   'href="https://example.com/" '
 ];
 // Same pattern as above, written as a regex literal
 const rx = /("https:\/\/)example\.com(?![^\s"]*\.(?:dont_match1|dont_match2)\/?")([^\s"]*")/g;
 // Print each input next to its rewritten form
 array.forEach(x => console.log(x, '=>', x.replace(rx, '$1example2.com$2')));
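Since the question says all the lines live in one file, here is a minimal sketch that runs the same replacement over a multi-line string containing all five sample lines from the question. The rewriteUrls name is an assumption made for illustration, and the string stands in for file contents you would read yourself (for example with Node's fs.readFileSync).

```javascript
// Same regex as in the answer; the g flag replaces every occurrence in the text.
const rx = /("https:\/\/)example\.com(?![^\s"]*\.(?:dont_match1|dont_match2)\/?")([^\s"]*")/g;

// Hypothetical helper: rewrite every qualifying URL in a block of text.
function rewriteUrls(text) {
  return text.replace(rx, '$1example2.com$2');
}

// Stand-in for the file contents, using the sample lines from the question.
const fileContent = [
  'bla ...... "https://example.com/content/a.dont_match1"',
  'bla ...... "https://example.com/content/a.dont_match2"',
  'href="https://example.com/"',
  '<link rel="canonical" href="https://example.com adasd /" />',
  '<link rel="manifest" href="https://example.com/a/asd/aaaa">',
].join('\n');

const rewritten = rewriteUrls(fileContent);
console.log(rewritten);
```

Only the third and fifth lines should change: the first two end in an excluded pattern, and the fourth has whitespace inside the quotes, so Group 2 cannot reach the closing ".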


 