
RegEx: Match second occurrence of character set in quotes

I'm looking to match the second occurrence of a character set enclosed in quotes. For example:

"{08165EA0-E946-11CF-9C87-00AA005127ED}"="WebCheckWebCrawler"

I only want to select WebCheckWebCrawler, not 08165EA0-E946-11CF-9C87-00AA005127ED.

Here's what I have so far, but I'm unable to select the second occurrence.

https://regex101.com/r/Dr7ly2/4

Thanks for your help.

Generic solution

Getting the nth occurrence of a match is in most cases achieved by extracting all matches from the string and then picking the necessary item by its index. Just a quick sample in Powershell:

Select-String '"([^"]+)"' -input $str -AllMatches | % { $_.matches } |
    % { $_.groups[1].value } | select -Skip 1 -First 1

Here, Select-String '"([^"]+)"' -input $str -AllMatches | % { $_.matches } | % { $_.groups[1].value } gets all matches and collects all Group 1 values (the substrings within double quotes, excluding the quotes), and select -Skip 1 -First 1 skips the first item and returns the next one. Other languages offer similar approaches. However, this takes a bit of code and is considered "expensive", since memory must be allocated for all the matches and their internal structures.
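For comparison, the same extract-all-then-index idea can be sketched in JavaScript. This is only an illustration: the helper name nthQuoted is made up for this example, and String.prototype.matchAll assumes an ES2020-capable runtime.

```javascript
// Hypothetical helper (not from the original answer):
// returns the n-th (1-based) double-quoted substring, or undefined if there is none.
function nthQuoted(str, n) {
  // matchAll requires the g flag; m[1] is Group 1, the text between the quotes.
  const values = [...str.matchAll(/"([^"]+)"/g)].map(m => m[1]);
  return values[n - 1];
}

const str = '"{08165EA0-E946-11CF-9C87-00AA005127ED}"="WebCheckWebCrawler"';
console.log(nthQuoted(str, 2)); // WebCheckWebCrawler
```

As in the Powershell version, every match is materialized before one is picked, so the same memory caveat applies.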

A specific solution

In text editors, and in languages with no regex method for fetching multiple matches, the above solution does not work. In those cases, this kind of regex is used to get the second match:

^(?:.*?"([^"]*)"){2}
^(?:[^"]*"([^"]*)"){2}
^(?:.*?(<YOUR_PATTERN_HERE>)){2}

See the regex demo. Note: this requires a regex method that returns the whole match object structure with the captured substrings (submatches, captures). Also, note that .*? does not match line break chars by default and is slower than [^"]*, which also matches line break chars.

Details:

  • ^ - start of string
  • (?: - start of a non-capturing group:
    • .*? - any zero or more chars other than line break chars, as few as possible
    • " - a " char
    • ([^"]*) - Capturing group 1: any zero or more ( * ) chars other than "
    • " - a " char
  • ){2} - end of the group, repeated twice.
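A minimal JavaScript sketch of the pattern breakdown above, assuming nothing beyond the standard RegExp API. It relies on the fact that a repeated capturing group keeps only its last capture, and it also illustrates the line break remark; the multiline sample string is invented for the demonstration.

```javascript
const str = '"{08165EA0-E946-11CF-9C87-00AA005127ED}"="WebCheckWebCrawler"';

// The non-capturing group runs twice; Group 1 retains the capture of the last iteration.
const m = str.match(/^(?:[^"]*"([^"]*)"){2}/);
const second = m ? m[1] : null;
console.log(second); // WebCheckWebCrawler

// Made-up input with a line break between the two quoted values.
const multiline = '"first"\n"second"';
// .*? cannot cross the line break (no s flag), so the second iteration fails.
const dotVariant = /^(?:.*?"([^"]*)"){2}/.test(multiline);
// [^"]* is a negated class and matches the line break as well.
const classVariant = /^(?:[^"]*"([^"]*)"){2}/.test(multiline);
console.log(dotVariant, classVariant); // false true
```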

What if you want the whole match? This depends on the regex library. In Powershell, it is easy to get, since .NET regex supports infinite-width lookbehind patterns:

(?<=^(?:[^"]*"[^"]*"){1}[^"]*")[^"]*(?=")

See this regex demo. Note that .*? is replaced with [^"]* to make sure no " can be matched in between "...", or the (?<=^(?:[^"]*"[^"]*"){1}[^"]*") lookbehind would match many more strings here.

Powershell code snippet:

Select-String '(?<=^(?:[^"]*"[^"]*"){1}[^"]*")[^"]*(?=")' -input $str | % { $_.matches.value }
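The same lookbehind pattern also appears to work unchanged in JavaScript, assuming an engine with ES2018 lookbehind support (recent Node.js or Chromium-based browsers, for instance). A small sketch:

```javascript
const str = '"{08165EA0-E946-11CF-9C87-00AA005127ED}"="WebCheckWebCrawler"';

// The lookbehind demands one complete "..." pair plus the opening quote of the next pair
// before the current position; the lookahead demands a closing quote after it.
const re = /(?<=^(?:[^"]*"[^"]*"){1}[^"]*")[^"]*(?=")/;
const m = str.match(re);
const whole = m ? m[0] : null;
console.log(whole); // WebCheckWebCrawler
```

Here the whole match (index 0) is the wanted value, so no capturing group is needed.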

In PCRE, you could use

^(?:.*?"[^"]*"){1}.*?"\K[^"]*(?=")

See the regex demo. The \K omits the text matched so far from the overall match, so all that is returned is the portion of text matched with the last [^"]* ((?=") is a positive lookahead; being a non-consuming pattern, its match is not added to the overall match). It works well in PHP, R, Sublime Text (PCRE), Ruby (Onigmo), Notepad++ (Boost). Unfortunately, Powershell does not support \K.

Current scenario solution

You do not need to use such complex patterns. You may use

="([^"]+)"

See the regex demo.

Details

  • =" - a =" substring
  • ([^"]+) - Group 1 capturing 1 or more chars other than "
  • " - a " char.

Grab the value inside Group 1, $matches[1].

In Powershell, the value you need can be obtained like this:

PS> $str = '"{08165EA0-E946-11CF-9C87-00AA005127ED}"="WebCheckWebCrawler"';
PS> $pattern = '="([^"]+)"'
PS> $str -match $pattern
True
PS> $matches[1]
WebCheckWebCrawler
PS>

Try this.

let str = '"{08165EA0-E946-11CF-9C87-00AA005127ED}"="WebCheckWebCrawler"';
// (?<==) is a lookbehind for "="; the match keeps its quotes.
console.log(str.match(/(?<==)".*?"/g));

To capture you can use:

="(.*)"

Check the demo.

You can also use: ="(.*?)"
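The two patterns agree on the sample from the question, but they differ as soon as a line holds more than one quoted value after an =. A small JavaScript sketch; the second input line is invented purely to show the difference:

```javascript
const sample = '"{08165EA0-E946-11CF-9C87-00AA005127ED}"="WebCheckWebCrawler"';
// Made-up line with two key/value pairs, for illustration only.
const line = '"a"="first" ; "b"="second"';

// Greedy .* runs to the last quote on the line; lazy .*? stops at the first closing quote.
const greedy = line.match(/="(.*)"/)[1];
const lazy = line.match(/="(.*?)"/)[1];
console.log(greedy); // first" ; "b"="second
console.log(lazy);   // first

// On the original sample both variants capture the same text.
const sameOnSample =
  sample.match(/="(.*)"/)[1] === sample.match(/="(.*?)"/)[1];
console.log(sameOnSample); // true
```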
