Hi guys, I just had a quick question about matching across multiple lines with a regex:
The Regex:
string content = Regex.Match(onix.Substring(startIndex,endIndex - startIndex), @">(.+)<", RegexOptions.Multiline).Groups[1].Value;
Here is the string of text I am reading:
<Title>
<TitleType>01</TitleType>
<TitleText textcase="02">18th Century Embroidery Techniques</TitleText>
</Title>
Here is what I am getting:
01
What I want is everything between <Title> and </Title>.
This works perfectly when everything is on one line, but since the content starts on another line, the regex seems to skip it or leave it out of the match.
Any assistance is much appreciated.
You need the Singleline option so that . also matches line breaks. Multiline only affects ^ and $, so it is harmless here but does not help on its own:
string content = Regex.Match(onix.Substring(startIndex,endIndex - startIndex), @">(.+)<", RegexOptions.Multiline | RegexOptions.Singleline).Groups[1].Value;
But do yourself a favor and stop parsing XML using Regular Expressions! Use an XML parser instead!
You can parse the XML text using the XmlDocument class, and use XPath selectors to get to the element you're interested in:
XmlDocument doc = new XmlDocument();
doc.LoadXml(...); // load your XML text here
XmlNode root = doc.SelectSingleNode("Title"); // this selects the <Title>..</Title> element
// modify the selector depending on your outer XML
Console.WriteLine(root.InnerXml); // displays the contents of the selected node
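For reference, here is a self-contained sketch of the same approach, run against the sample from the question. The class name TitleReader and the helper GetTitleText are made up for illustration, and the XPath assumes <Title> is the root element of the text you load, so adjust it to your actual document:

```csharp
using System;
using System.Xml;

class TitleReader
{
    // Hypothetical helper: returns the text of <TitleText> inside a root <Title> element.
    public static string GetTitleText(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        // Assumes <Title> is the document root; change the XPath for other layouts.
        XmlNode node = doc.SelectSingleNode("/Title/TitleText");
        return node == null ? null : node.InnerText;
    }

    static void Main()
    {
        // Sample input copied from the question.
        string xml =
            "<Title>\n" +
            "<TitleType>01</TitleType>\n" +
            "<TitleText textcase=\"02\">18th Century Embroidery Techniques</TitleText>\n" +
            "</Title>";

        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        // Everything between <Title> and </Title>, as markup.
        Console.WriteLine(doc.SelectSingleNode("Title").InnerXml);

        // Just the title text.
        Console.WriteLine(GetTitleText(xml));
    }
}
```

Unlike the regex, this does not depend on where the line breaks fall in the input.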
RegexOptions.Multiline only changes the meaning of ^ and $, so that they match at the beginning/end of each line instead of the beginning/end of the entire string.
You want to use RegexOptions.Singleline instead, which makes . match line breaks (as well as everything else).
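A small sketch of the difference, using the same pattern as the question on a shortened input. The class name RegexOptionsDemo and the helper Capture are invented for this example:

```csharp
using System;
using System.Text.RegularExpressions;

class RegexOptionsDemo
{
    // Hypothetical helper: applies the question's pattern and returns capture group 1.
    public static string Capture(string input, RegexOptions options)
    {
        return Regex.Match(input, @">(.+)<", options).Groups[1].Value;
    }

    static void Main()
    {
        // Shortened version of the question's input, with real line breaks.
        string xml = "<Title>\n<TitleType>01</TitleType>\n</Title>";

        // Multiline only: '.' still stops at '\n', so the match cannot start
        // right after <Title> and the first hit is ">01<".
        Console.WriteLine(Capture(xml, RegexOptions.Multiline)); // 01

        // Singleline: '.' also matches '\n', so the greedy group runs from
        // the first '>' to the last '<' in the input.
        Console.WriteLine(Capture(xml, RegexOptions.Singleline).Trim()); // <TitleType>01</TitleType>
    }
}
```

Note that the Singleline capture includes the line breaks around the inner element, which is why the example trims it before printing.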
Since this is probably XML, you might want to use an XML parser instead. Where possible, that is preferable to picking the text apart with regular expressions. Please disregard if not applicable.