
Regex to extract string between quotes

I'm trying to extract a string between two quotes. I thought I had my regex working, but it gives me two strings in my GroupCollection, and I can't get it to ignore the first one, which includes ID= and the opening quote.

The string that I want to parse is

Test ID="12345" hello

I want to return 12345 in a group, so that I can manipulate it in code later. I've tried the following regex: http://regexr.com/3bgtl, with this code:

string nodeValue = "Test ID=\"12345\" hello";
GroupCollection ids = Regex.Match(nodeValue, "ID=\"([^\"]*)").Groups;

The problem is that the GroupCollection contains two entries:

ID="12345

12345

I just want it to return the second one.

Use a positive lookbehind:

GroupCollection ids = Regex.Match(nodeValue, "(?<=ID=\")[^\"]*").Groups;

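Your original pattern contains a capturing group (the parentheses), which is why you get two results: group 0 is always the entire match, and group 1 is the text captured by the parentheses. With the lookbehind, ID=" is not part of the match, so the match itself is just 12345.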
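A minimal sketch of how the lookbehind version behaves, assuming the same input string as in the question. The class name LookbehindDemo and the helper ExtractId are only illustrative and do not come from the original post:

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookbehindDemo
{
    // Hypothetical helper: returns the text after ID=" up to the next quote,
    // or an empty string when the input has no ID="..." part.
    public static string ExtractId(string input)
    {
        // (?<=ID=") asserts that ID=" precedes the match without consuming it,
        // so the match itself is only the text between the quotes.
        Match match = Regex.Match(input, "(?<=ID=\")[^\"]*");
        return match.Success ? match.Value : string.Empty;
    }

    public static void Main()
    {
        string nodeValue = "Test ID=\"12345\" hello";
        Match match = Regex.Match(nodeValue, "(?<=ID=\")[^\"]*");
        Console.WriteLine(match.Groups.Count); // 1: only group 0, the whole match
        Console.WriteLine(match.Value);        // 12345
    }
}
```

Because the pattern has no parentheses that capture, the GroupCollection holds a single entry, and Match.Value can be read directly.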

There are a few ways to accomplish this. I like named capture groups for readability.

Regex with named capture group:

"(?<capture>.*?)"

And your code would be:

match.Groups["capture"].Value
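Put together, the named-group approach might look like the sketch below. The class name NamedGroupDemo and the helper ExtractQuoted are made up for illustration, and the pattern is the one above written as a C# string literal with escaped quotes:

```csharp
using System;
using System.Text.RegularExpressions;

public static class NamedGroupDemo
{
    // Hypothetical helper: returns the contents of the first quoted string,
    // or an empty string when the input contains no quoted text.
    public static string ExtractQuoted(string input)
    {
        // The outer \" characters are literal quotes in the pattern;
        // (?<capture>.*?) lazily captures whatever sits between them.
        Match match = Regex.Match(input, "\"(?<capture>.*?)\"");
        return match.Success ? match.Groups["capture"].Value : string.Empty;
    }

    public static void Main()
    {
        Console.WriteLine(ExtractQuoted("Test ID=\"12345\" hello")); // 12345
    }
}
```

Note that this pattern matches the first quoted string anywhere in the input. If other quoted values can appear before the ID, prefixing the pattern with ID= would narrow it.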

Your code is totally OK and is the most efficient of all the solutions suggested here. Capturing groups are the quickest and least resource-consuming way to extract substrings from larger texts.

All you need to do is access captured group 1, which is defined by the parentheses in your regex. Like this:

var nodeValue = "Test ID=\"12345\" hello";
GroupCollection ids = Regex.Match(nodeValue, "ID=\"([^\"]*)").Groups;
Console.WriteLine(ids[1].Value);
// or just on one line
// Console.WriteLine(Regex.Match(nodeValue, "ID=\"([^\"]*)").Groups[1].Value);

See IDEONE demo

Please have a look at Grouping Constructs in Regular Expressions:

Grouping constructs delineate the subexpressions of a regular expression and capture the substrings of an input string. You can use grouping constructs to do the following:

  • Match a subexpression that is repeated in the input string.
  • Apply a quantifier to a subexpression that has multiple regular expression language elements. For more information about quantifiers, see Quantifiers in Regular Expressions.
  • Include a subexpression in the string that is returned by the Regex.Replace and Match.Result methods.
  • Retrieve individual subexpressions from the Match.Groups property and process them separately from the matched text as a whole.

Note that if you do not need overlapping matches, the capturing group mechanism is the best solution here.
