I would like to ask which regex I can use to split the text string on each <math xmlns='http://www.w3.org/1998/Math/MathML'>....</math> element, so that every <math> element and every piece of text between them becomes a separate item in the result.
The code is:
var text = @"{(test&<math xmlns='http://www.w3.org/1998/Math/MathML'><apply><plus></plus><cn>1</cn><cn>2</cn></apply></math>)|(<math xmlns='http://www.w3.org/1998/Math/MathML'><apply><root></root><degree><ci>m</ci></degree><ci>m</ci></apply></math>&nnm)&<math xmlns='http://www.w3.org/1998/Math/MathML'><apply><power></power><cn>1</cn><cn>2</cn></apply></math>#<math xmlns='http://www.w3.org/1998/Math/MathML'><set><ci>l</ci></set></math>}";
string findTagString = "(<math.*?>)|(.+?(?=<math/>))";
Regex findTag = new Regex(findTagString);
List<string> textList = findTag.Split(text).ToList();
I found a similar question, "Using Regex to split XML string before and after match", and I would like advice on the regex expression.
Thank you
Ori
After some testing, I think this does the job:
string findTagString = "(<math.*?></math>)|((.*){}()#&(.*))</math>";
Here is my attempt, based on a zero-length look-ahead and look-behind:
(?=<math[^>]*>)|(?<=</math>)
Code:
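using System;
using System.Text.RegularExpressions;

// Zero-width split points: just before each opening <math ...> tag and just after each </math>.
string findTagString = "(?=<math[^>]*>)|(?<=</math>)";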
var text = @"{(test&<math xmlns='http://www.w3.org/1998/Math/MathML'><apply><plus></plus><cn>1</cn><cn>2</cn></apply></math>)|(<math xmlns='http://www.w3.org/1998/Math/MathML'><apply><root></root><degree><ci>m</ci></degree><ci>m</ci></apply></math>&nnm)&<math xmlns='http://www.w3.org/1998/Math/MathML'><apply><power></power><cn>1</cn><cn>2</cn></apply></math>#<math xmlns='http://www.w3.org/1998/Math/MathML'><set><ci>l</ci></set></math>}";
Regex findTag = new Regex(findTagString);
string[] textList = findTag.Split(text);
Console.WriteLine(string.Join("\n", textList));
Output of the sample program:
{(test&
<math xmlns='http://www.w3.org/1998/Math/MathML'><apply><plus></plus><cn>1</cn><cn>2</cn></apply></math>
)|(
<math xmlns='http://www.w3.org/1998/Math/MathML'><apply><root></root><degree><ci>m</ci></degree><ci>m</ci></apply></math>
&nnm)&
<math xmlns='http://www.w3.org/1998/Math/MathML'><apply><power></power><cn>1</cn><cn>2</cn></apply></math>
#
<math xmlns='http://www.w3.org/1998/Math/MathML'><set><ci>l</ci></set></math>
}
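A side note on the split above: both alternatives in the pattern are zero-width, so Regex.Split can return an empty string at the start or end of the array when the input itself begins or ends with a <math> element. Here is a minimal sketch that filters those out. The MathSplitter class and SplitAroundMath method are hypothetical names used for illustration, not part of the original answer.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class MathSplitter
{
    // Zero-width split points: before each opening <math ...> tag
    // and after each closing </math> tag.
    private static readonly Regex SplitPoints =
        new Regex(@"(?=<math[^>]*>)|(?<=</math>)");

    // Hypothetical helper: splits the input at those points and drops
    // any empty pieces produced by a match at the very start or end.
    public static string[] SplitAroundMath(string input)
    {
        return SplitPoints.Split(input)
                          .Where(piece => piece.Length > 0)
                          .ToArray();
    }

    public static void Main()
    {
        // Input that starts and ends with a <math> element.
        string[] parts = SplitAroundMath("<math><ci>a</ci></math>+<math><ci>b</ci></math>");
        Console.WriteLine(string.Join("\n", parts));
    }
}
```

For the text in the question the filter changes nothing, because that string starts with "{" and ends with "}".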
I would advise against using regular expressions on XML. XML is not a regular language, so regular expressions cannot handle it reliably in general, for example when elements nest arbitrarily deep. Besides, .NET provides such convenient tools for parsing XML that I really don't see the point.
My suggestion is that you use LINQ to XML instead of regexes.
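As a rough illustration of the LINQ to XML route: the full string in the question is not well-formed XML on its own (it has text outside any root element and bare & characters), so this sketch assumes each <math> fragment has already been isolated. MathFragmentInspector and GetOperatorName are hypothetical names. The helper simply reads the first child of <apply>, which holds the operator in the sample data.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class MathFragmentInspector
{
    // The default namespace declared on the <math> elements in the question.
    private static readonly XNamespace MathMl = "http://www.w3.org/1998/Math/MathML";

    // Hypothetical helper: parses one <math> fragment and returns the local
    // name of the first child of <apply>, or null if there is no <apply>.
    public static string GetOperatorName(string mathFragment)
    {
        XElement math = XElement.Parse(mathFragment);

        // Child elements inherit the default namespace, so the lookup
        // has to be namespace-qualified.
        XElement apply = math.Element(MathMl + "apply");
        return apply?.Elements().FirstOrDefault()?.Name.LocalName;
    }

    public static void Main()
    {
        string fragment =
            "<math xmlns='http://www.w3.org/1998/Math/MathML'>" +
            "<apply><plus></plus><cn>1</cn><cn>2</cn></apply></math>";
        Console.WriteLine(GetOperatorName(fragment));
    }
}
```

Once a fragment is an XElement, queries such as Descendants or Elements replace any further pattern matching on the markup.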