
Regex using Multiline and Groups

Hi guys, just a quick question about using multiline in regex:

The Regex:

 string content = Regex.Match(onix.Substring(startIndex,endIndex - startIndex), @">(.+)<", RegexOptions.Multiline).Groups[1].Value;

Here is the string of text I am reading:

    <Title>
         <TitleType>01</TitleType>
         <TitleText textcase="02">18th Century Embroidery Techniques</TitleText>
    </Title>

Here is what I am getting:

01

What I want is everything between <Title> and </Title>.

This works perfectly when everything is on one line, but since the content here starts on another line, the pattern seems to skip it.

Any assistance is much appreciated.

You also need the Singleline option, which makes . match newlines as well; it can be combined with Multiline:

string content = Regex.Match(onix.Substring(startIndex,endIndex - startIndex), @">(.+)<", RegexOptions.Multiline | RegexOptions.Singleline).Groups[1].Value;

But do yourself a favor and stop parsing XML using Regular Expressions! Use an XML parser instead!

You can parse the XML text using the XmlDocument class, and use XPath selectors to get to the element you're interested in:

XmlDocument doc = new XmlDocument();
doc.LoadXml(...);                              // load your XML text here

XmlNode root = doc.SelectSingleNode("Title");  // this selects the <Title>..</Title> element
                                               // modify the selector depending on your outer XML 
Console.WriteLine(root.InnerXml);              // displays the contents of the selected node
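
For reference, here is a self-contained sketch of the same XmlDocument approach, run against the fragment from the question. It assumes the fragment is a standalone document with <Title> as its root and no XML namespaces; in a full ONIX file the XPath would need adjusting. The class name, the Sample constant and the ReadTitleText helper are hypothetical names chosen for illustration.

```csharp
using System;
using System.Xml;

public static class TitleXmlDemo
{
    // Fragment from the question, hard-coded for illustration (assumption:
    // it is the whole document, so <Title> is the root element).
    public const string Sample =
        "<Title>\n" +
        "    <TitleType>01</TitleType>\n" +
        "    <TitleText textcase=\"02\">18th Century Embroidery Techniques</TitleText>\n" +
        "</Title>";

    // Hypothetical helper: returns the text of <TitleText> under the root <Title>,
    // or null if no such element exists.
    public static string ReadTitleText(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        // XPath evaluated from the document node: root <Title>, then its <TitleText> child.
        XmlNode node = doc.SelectSingleNode("Title/TitleText");
        return node == null ? null : node.InnerText;
    }

    public static void Main()
    {
        Console.WriteLine(ReadTitleText(Sample)); // prints: 18th Century Embroidery Techniques
    }
}
```

Unlike the regex, this does not depend on where the line breaks fall, and it returns null rather than a partial match when the element is missing.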

RegexOptions.Multiline will just change the meaning of ^ and $ to beginning/end of lines instead of beginning/end of the entire string.

You want to use RegexOptions.Singleline instead, which makes . match line breaks (as well as every other character).
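
A minimal sketch of the difference, using the pattern and sample text from the question. The class name, the Sample constant and the Extract helper are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexOptionsDemo
{
    // Sample text from the question, hard-coded for illustration.
    public const string Sample =
        "<Title>\n" +
        "    <TitleType>01</TitleType>\n" +
        "    <TitleText textcase=\"02\">18th Century Embroidery Techniques</TitleText>\n" +
        "</Title>";

    // Hypothetical helper: applies the question's pattern with the given options
    // and returns the first capture group.
    public static string Extract(string input, RegexOptions options)
    {
        return Regex.Match(input, @">(.+)<", options).Groups[1].Value;
    }

    public static void Main()
    {
        // Multiline only changes ^ and $; '.' still stops at '\n',
        // so the match is confined to a single line.
        Console.WriteLine(Extract(Sample, RegexOptions.Multiline)); // prints: 01

        // Singleline lets '.' match '\n', so the greedy (.+) runs from
        // the first '>' to the last '<' in the input.
        Console.WriteLine(Extract(Sample, RegexOptions.Singleline).Trim());
    }
}
```

Because (.+) is greedy, the Singleline version captures everything from the first > to the last < in the input, so it only returns the inner content of <Title> when the input holds exactly that one element.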

Since this looks like XML, you may want to use an XML parser instead. Where possible, that is preferable to picking the markup apart with regular expressions. Please disregard if not applicable.

