I want to query a string (HTML) from a database and display it on a webpage. The problem is that the data has a <p> around the text (ending with </p>).
I want to strip this outer tag in my view model or in the controller action that returns this data. What is the best way of doing this in C#?
Might be overkill for your needs, but if you want to parse the HTML you can use the HtmlAgilityPack - certainly a cleaner solution in general than most suggested here, although it might not be as performant:
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml("<p> around the text (ending with </p>");
string result = doc.DocumentNode.FirstChild.InnerHtml;
If you're absolutely sure the string will always have that tag, you can use String.Substring, for example myString.Substring(3, myString.Length - 7). That skips the 3 characters of <p> and drops the 4 characters of </p>.
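As a minimal sketch of that idea with a safety check added, the helper below only strips the tags when the string really starts with <p> and ends with </p>. The class and method names (ParagraphStripper, StripOuterP) are hypothetical and not part of the original answer.

```csharp
using System;

public static class ParagraphStripper
{
    // Hypothetical helper: returns the text inside an outer <p>...</p> pair,
    // or the input unchanged when the string is not wrapped that way.
    public static string StripOuterP(string html)
    {
        const string open = "<p>";
        const string close = "</p>";

        if (html != null
            && html.Length >= open.Length + close.Length
            && html.StartsWith(open, StringComparison.OrdinalIgnoreCase)
            && html.EndsWith(close, StringComparison.OrdinalIgnoreCase))
        {
            // Skip the opening tag and drop the closing tag (3 + 4 = 7 characters).
            return html.Substring(open.Length, html.Length - open.Length - close.Length);
        }

        return html;
    }
}
```

This assumes the tag has no attributes and no surrounding whitespace; anything else is returned untouched rather than truncated.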
A more robust method would be to either manually code the appropriate tests or use a regular expression, or ultimately, use an HTML parser as suggested by BrokenGlass's answer.
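UPDATE: Using regexes you could do:
String filteredString = Regex.Match(myString, "^<p>(.*)</p>").Groups[1].Value; // Groups[1] is the text between the tags; ToString() would return the whole match, tags included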
You could add \\s* after the initial ^ to also remove leading whitespace. You can also check the Success property of the Match result to see whether the string matched the <p>...</p> pattern at all.
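A sketch of the regex approach with both suggestions applied might look like the following. The class name RegexParagraphStripper and the choice of RegexOptions are assumptions for illustration: Singleline lets . match newlines inside the paragraph, and IgnoreCase also accepts <P>.

```csharp
using System.Text.RegularExpressions;

public static class RegexParagraphStripper
{
    // \s* tolerates whitespace around the outer tags.
    // (.*) is greedy, so it captures up to the LAST </p> in the string.
    private static readonly Regex OuterP = new Regex(
        @"^\s*<p>(.*)</p>\s*$",
        RegexOptions.Singleline | RegexOptions.IgnoreCase);

    // Hypothetical helper: returns the captured inner text on a match,
    // otherwise the input unchanged.
    public static string StripOuterP(string html)
    {
        Match m = OuterP.Match(html);
        return m.Success ? m.Groups[1].Value : html;
    }
}
```

Note that an input made of two sibling paragraphs, such as <p>a</p><p>b</p>, also matches this pattern and yields a</p><p>b, which is one reason an HTML parser is safer for anything beyond the simple case.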
If the data is always surrounded by <p> ... </p>:
string withoutParas = withParas.Substring(3, withParas.Length - 7);
Try using the string method Remove(): pass it the IndexOf() of <p> with a length of 3, and the LastIndexOf() of </p> with a length of 4.
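The Remove() suggestion could be sketched roughly as follows. The names RemoveParagraphTags and StripOuterP are hypothetical, and the guard for a missing tag is an addition that the answer does not mention.

```csharp
using System;

public static class RemoveParagraphTags
{
    // Hypothetical helper: removes the first <p> and the last </p> in the string.
    public static string StripOuterP(string html)
    {
        int open = html.IndexOf("<p>", StringComparison.Ordinal);
        int close = html.LastIndexOf("</p>", StringComparison.Ordinal);

        // Leave the string alone if either tag is missing or they are out of order.
        if (open < 0 || close < open)
        {
            return html;
        }

        // Remove the closing tag first so the index of the opening tag stays valid.
        return html.Remove(close, 4).Remove(open, 3);
    }
}
```

Unlike the Substring version, this keeps any text before the first <p> or after the last </p>, so it only gives the expected result when the tags really are the outermost content.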
If you are absolutely guaranteed that your string will always fit the pattern of <p>...</p>, then the other solutions using data.Substring(3, data.Length - 7) are sufficient. If, however, there's any chance that it could look at all different, then you really need to use an HTML parser. The consensus is that the HTML Agility Pack is the way to go.
// Note: this removes every <p> and </p> in the string, not only the outer pair.
s = s.Replace("<p>", String.Empty).Replace("</p>", String.Empty);