
How to parse a string containing HTML tags into bold, italic, and underlined substrings

I created a text rendering tool for a 2D graphics framework in C#.

Now I am trying to parse text that contains specific HTML tags, such as:

"Hello <b>world</b>!" 

But my parsing code was getting ugly, and I thought there must be a library that does exactly this. In the end it should output an array of data structures like:

string text;
bool IsBold;
bool IsItalic;
bool IsUnderlined;
...

or

string text;
FontStyle FontStyle;

Does anyone know of such a parser?

Thanks a lot!

The HTML Agility Pack is a good HTML parser (and also parses fragments).

You can query it using XPath syntax (its API is similar to XmlDocument), though I am not sure how good a fit it will be for your requirements.
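As a rough illustration, the runs described in the question could be collected by walking the parsed node tree. This is only a sketch: it assumes the HtmlAgilityPack NuGet package is referenced and a compiler that supports records (C# 9 or later). TextRun and StyledTextParser are hypothetical names made up for this example, and only the b, i and u tags are handled.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // assumption: NuGet package "HtmlAgilityPack" is installed

// Hypothetical output type: one run of text plus its style flags.
public record TextRun(string Text, bool IsBold, bool IsItalic, bool IsUnderlined);

public static class StyledTextParser
{
    // Parses an HTML fragment and returns its text runs in document order.
    public static List<TextRun> Parse(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var runs = new List<TextRun>();
        Walk(doc.DocumentNode, false, false, false, runs);
        return runs;
    }

    // Depth-first walk; the style flags are inherited from enclosing tags.
    private static void Walk(HtmlNode node, bool bold, bool italic,
                             bool underlined, List<TextRun> runs)
    {
        foreach (HtmlNode child in node.ChildNodes)
        {
            if (child.NodeType == HtmlNodeType.Text)
            {
                // Decode entities such as &amp; before storing the text.
                string text = HtmlEntity.DeEntitize(child.InnerText);
                if (text.Length > 0)
                    runs.Add(new TextRun(text, bold, italic, underlined));
            }
            else if (child.NodeType == HtmlNodeType.Element)
            {
                Walk(child,
                     bold || IsTag(child, "b"),
                     italic || IsTag(child, "i"),
                     underlined || IsTag(child, "u"),
                     runs);
            }
        }
    }

    private static bool IsTag(HtmlNode node, string name) =>
        string.Equals(node.Name, name, StringComparison.OrdinalIgnoreCase);
}
```

Calling StyledTextParser.Parse("Hello <b>world</b>!") should yield three runs: "Hello " with no style, "world" with IsBold set, and "!" with no style. Nested tags such as <b><i>x</i></b> combine their flags because each recursive call passes the inherited state down.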

I do not know how well these would work for your case, but here are some HTML parsers:
html_parse
htmlagilitypack

Tidy.net is a fantastic tool. It is a port of the original Tidy project, which is used in the HTML Tidy Firefox plugin. Run your markup through Tidy and it will return clean, compliant HTML.

