RegEx: Match second occurrence of character set in quotes

I'm looking to match the second occurrence of a character set enclosed in quotes. For example:

"{08165EA0-E946-11CF-9C87-00AA005127ED}"="WebCheckWebCrawler"

I only want to select WebCheckWebCrawler, not 08165EA0-E946-11CF-9C87-00AA005127ED.

Here's what I have so far, but I'm unable to select the second occurrence.

https://regex101.com/r/Dr7ly2/4

Thanks for your help.

Generic solution

Getting the nth occurrence of a match is usually done by extracting all matches from the string and then picking the required item by its index. A quick sample in PowerShell:

Select-String '"([^"]+)"' -input $str -AllMatches | % { $_.matches } |
    % { $_.groups[1].value } | select -Skip 1 -First 1

Here, Select-String '"([^"]+)"' -input $str -AllMatches | % { $_.matches } | % { $_.groups[1].value } gets all matches and collects all Group 1 values (the substrings inside double quotes, excluding the quotes), and select -Skip 1 -First 1 skips the first item and returns the next one. Other languages offer similar ways. However, this takes a bit of code and is considered "expensive", since memory has to be allocated for all the matches and their internal structures.
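As an illustration of the same idea outside PowerShell, here is a minimal Python sketch. It assumes the sample line from the question; the variable names are made up for the example. The second half uses a lazy iterator so that scanning stops at the second match, which avoids building the full list of matches.

```python
import re
from itertools import islice

# Sample line from the question.
s = '"{08165EA0-E946-11CF-9C87-00AA005127ED}"="WebCheckWebCrawler"'

# Collect every Group 1 value (text between double quotes), then index.
values = re.findall(r'"([^"]+)"', s)
second = values[1] if len(values) > 1 else None

# Lazy variant: finditer yields matches one by one, and islice(…, 1, 2)
# skips the first match and stops right after the second.
m = next(islice(re.finditer(r'"([^"]+)"', s), 1, 2), None)
second_lazy = m.group(1) if m else None

print(second)       # WebCheckWebCrawler
print(second_lazy)  # WebCheckWebCrawler
```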

A specific solution

In text editors, and in languages with no regex method for fetching multiple matches, the above solution does not work. In those cases, a regex like one of the following is used to get the second match:

^(?:.*?"([^"]*)"){2}
^(?:[^"]*"([^"]*)"){2}
^(?:.*?(<YOUR_PATTERN_HERE>)){2}

See the regex demo. Note: this requires a regex method that returns the whole match object structure with the captured substrings (submatches, captures). Also note that .*? does not match line break chars by default and is slower than [^"]*, which does match line break chars.

Details:

  • ^ - start of string
  • (?: - a non-capturing group start:
    • .*? - any zero or more chars other than line break chars as few as possible
    • " - a " char
    • ([^"]*) - Capturing group 1: any zero or more (*) chars other than "
    • " - a " char
  • ){2} - end of the group, repeated twice; Group 1 keeps the value captured during the last (second) repetition.
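The pattern breakdown above can be tried out with a short Python sketch. The helper name nth_quoted is hypothetical and only serves the example; it builds the second pattern from the list with a variable repeat count and reads Group 1, which holds the value from the last repetition.

```python
import re

def nth_quoted(text, n):
    """Return the n-th (1-based) double-quoted substring, or None if absent."""
    # Group 1 is overwritten on every repetition of the non-capturing group,
    # so after {n} repetitions it holds the n-th quoted value.
    pattern = r'^(?:[^"]*"([^"]*)"){' + str(n) + '}'
    m = re.match(pattern, text)
    return m.group(1) if m else None

# Sample line from the question.
s = '"{08165EA0-E946-11CF-9C87-00AA005127ED}"="WebCheckWebCrawler"'

print(nth_quoted(s, 2))  # WebCheckWebCrawler
print(nth_quoted(s, 3))  # None (there is no third quoted value)
```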

What if you want the second value to be the whole match rather than a captured group? This depends on the regex library. In PowerShell, it is easy, since .NET regex supports infinite-width lookbehind patterns:

(?<=^(?:[^"]*"[^"]*"){1}[^"]*")[^"]*(?=")

See this regex demo. Note that .*? is replaced with [^"]* to make sure no " can be matched in between "...", or the (?<=^(?:[^"]*"[^"]*") lookbehind would match many more strings here.

PowerShell code snippet:

Select-String '(?<=^(?:[^"]*"[^"]*"){1}[^"]*")[^"]*(?=")' -input $str | % { $_.matches.value }

In PCRE, you could use

^(?:.*?"[^"]*"){1}.*?"\K[^"]*(?=")

See the regex demo. The \K omits the whole text matched so far from the overall match, so all that is returned is the portion of text matched with the last [^"]*. The (?=") is a positive lookahead; being a non-consuming pattern, what it matches is not added to the overall match. This works well in PHP, R, Sublime Text (PCRE), Ruby (Onigmo), Notepad++ (Boost). PowerShell does not support \K, unfortunately.

Current scenario solution

You do not need to use such complex patterns. You may use

="([^"]+)"

See the regex demo.

Details

  • =" - a =" substring
  • ([^"]+) - Group 1 capturing 1 or more chars other than "
  • " - a " char.

Grab the value inside Group 1, $matches[1].

In PowerShell, the value you need can be obtained like this:

PS> $str = '"{08165EA0-E946-11CF-9C87-00AA005127ED}"="WebCheckWebCrawler"';
PS> $pattern = '="([^"]+)"'
PS> $str -match $pattern
True
PS> $matches[1]
WebCheckWebCrawler
PS>

Try this.

 let str = '"{08165EA0-E946-11CF-9C87-00AA005127ED}"="WebCheckWebCrawler"';
 // Match a quoted string preceded by "="; the match keeps the surrounding quotes.
 console.log(str.match(/(?<=\=)".*?"/g));

To capture you can use:

="(.*)"

Check demo

You can also use the lazy version, ="(.*?)", which stops at the first closing quote instead of the last one on the line.
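The greedy and lazy variants give the same result on the line from the question, but they can differ when more quoted text follows on the same line. A small Python sketch, in which the second input line is a made-up example and not part of the question:

```python
import re

# Sample line from the question.
s = '"{08165EA0-E946-11CF-9C87-00AA005127ED}"="WebCheckWebCrawler"'
# Hypothetical line with another quoted value after the one we want.
multi = '"Key"="First" "Second"'

greedy = r'="(.*)"'
lazy = r'="(.*?)"'
negated = r'="([^"]+)"'

# On the original line all three patterns capture the same text.
print(re.search(greedy, s).group(1))       # WebCheckWebCrawler
print(re.search(lazy, s).group(1))         # WebCheckWebCrawler

# With trailing quoted text, the greedy pattern runs to the last quote.
print(re.search(greedy, multi).group(1))   # First" "Second
print(re.search(lazy, multi).group(1))     # First
print(re.search(negated, multi).group(1))  # First
```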
