Splitting text in C# by tag
I am splitting a string in my code like this:
var lines = myString == null
? new string[] { }
: myString.Split(new[] { "\n", "<br />" }, StringSplitOptions.RemoveEmptyEntries);
The trouble is that sometimes the text looks like this:
sdjkgjkdgjk<br />asdfsdg
And in this case my code works. However, other times the text looks like this:
sdjkgjkdgjk<br style="someAttribute: someProperty;"/>asdfsdg
And in this case, I don't get the result I want. How can I split this string on the whole br tag, including all of its attributes?
Use Regex.Split(). Below is an example:
using System;
using System.Text.RegularExpressions;

public class Example
{
    public static void Main()
    {
        string input = "sdjkgjkdgjk<br />asdfsdg";
        // Split on any <br> tag, with or without attributes.
        // [^>]* stops at the first '>' so two tags on one line are not merged.
        string pattern = @"<br\b[^>]*>";
        DisplayByRegex(input, pattern);
        input = "sdjkgjkdgjk<br style=\"someAttribute: someProperty;\"/>asdfsdg";
        DisplayByRegex(input, pattern);
        Console.Read(); // Keep the console window open
    }

    private static void DisplayByRegex(string input, string pattern)
    {
        string[] substrings = Regex.Split(input, pattern);
        foreach (string match in substrings)
        {
            Console.WriteLine("'{0}'", match);
        }
    }
}
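One thing to watch with patterns of this kind is greediness: a greedy `.*` between `<br` and `/>` runs from the first tag to the last one on the same line, while `[^>]*` cannot cross the `>` that closes a tag. The sketch below contrasts the two on an input with two tags. The class and member names (`BrSplitDemo`, `GreedyPattern`, `BoundedPattern`, `SplitOn`) and the use of `RegexOptions.IgnoreCase` are illustrative choices, not part of the answer above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BrSplitDemo
{
    // Greedy: ".*" extends to the last "/>" on the line.
    public const string GreedyPattern = @"<br.*/>";

    // Bounded: "[^>]*" cannot run past the '>' that closes the tag.
    public const string BoundedPattern = @"<br\b[^>]*>";

    // Hypothetical helper: split input on the given pattern, ignoring case.
    public static string[] SplitOn(string input, string pattern)
    {
        return Regex.Split(input, pattern, RegexOptions.IgnoreCase);
    }

    public static void Main()
    {
        string input = "one<br />two<br style=\"a: b;\"/>three";
        Console.WriteLine(string.Join("|", SplitOn(input, GreedyPattern)));  // one|three
        Console.WriteLine(string.Join("|", SplitOn(input, BoundedPattern))); // one|two|three
    }
}
```

With the greedy pattern the middle segment is swallowed as part of the separator, which is easy to miss when every test string contains only one tag.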
If you only need to split on br tags and newlines, a regex is a good option:
// No capturing groups: Regex.Split would otherwise return the captured separators too.
var lines = myString == null ?
    new string[] { } :
    Regex.Split(myString, @"<br\b[^>]*>|\r\n?|\n");
But if your requirements get more complex, I'd suggest using an HTML parser.
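For the simple case, the regex option can be wrapped in a small helper that keeps the question's null check and drops empty entries, which `Regex.Split` does not do on its own (there is no counterpart to `StringSplitOptions.RemoveEmptyEntries`). This is a minimal sketch: `LineSplitter` and `SplitLines` are hypothetical names, and matching tags case-insensitively is an assumption about the input.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class LineSplitter
{
    // Matches a <br> tag with any attributes, or a CRLF / CR / LF line break.
    private static readonly Regex Separator =
        new Regex(@"<br\b[^>]*>|\r\n?|\n", RegexOptions.IgnoreCase);

    // Returns the non-empty pieces of text; null input yields an empty array.
    public static string[] SplitLines(string text)
    {
        if (text == null)
        {
            return new string[0];
        }

        return Separator.Split(text)
                        .Where(part => part.Length > 0) // emulate RemoveEmptyEntries
                        .ToArray();
    }
}
```

Calling `LineSplitter.SplitLines("a<br />\nb<br style=\"x: y;\"/>c")` should give the three entries `a`, `b` and `c`; the empty string between the tag and the following newline is filtered out.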
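You can try this. Note that it splits on whole <b> elements and, because the pattern is a capturing group, keeps the matched elements in the result (it also needs using System.Linq):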
var parts = Regex.Split(value, @"(<b>[\s\S]+?<\/b>)").Where(l => l != string.Empty).ToArray();
I hope the following code helps.
var items = Regex.Split("sdjkgjkdgjk<br style='someAttribute: someProperty;'/>asdfsdg", @"<.*?>");
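Note that `<.*?>` matches any tag, not only br, so text that contains other markup is split at those tags as well. The sketch below contrasts it with a pattern restricted to br tags. `TagSplitComparison` and its method names are hypothetical, and the br-only pattern is one possible choice rather than the one used in the answer above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TagSplitComparison
{
    // Splits at every tag, whatever its name.
    public static string[] SplitOnAnyTag(string input)
    {
        return Regex.Split(input, @"<.*?>");
    }

    // Splits only at <br> tags, with or without attributes.
    public static string[] SplitOnBrOnly(string input)
    {
        return Regex.Split(input, @"<br\b[^>]*>", RegexOptions.IgnoreCase);
    }

    public static void Main()
    {
        string input = "first <b>bold</b> line<br/>second";
        Console.WriteLine(SplitOnAnyTag(input).Length); // 4
        Console.WriteLine(SplitOnBrOnly(input).Length); // 2
    }
}
```

If the text is known to contain nothing but br tags, both patterns give the same pieces; the difference only shows up once other elements appear.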
You should use a regular expression. Here you can find a good tutorial for your purpose.