
C# - Best way to parse XML-like text and perform actions

I have a small text string with XML-like tags inside it:

<sub>A</sub>B<sup>C</sup>

I need to parse this text and perform actions based on the tags, so that the text above renders as A (subscript), B, C (superscript) in my target application, MS Excel. Excel parses and formats this string if I paste it, but not if I just enter it in a cell.

What is the best way to parse this type of tag-based text in terms of performance? The formatting code is going to be called very frequently, and I want to minimize the overhead as much as possible. I can think of the following options:

  1. Parse it character by character using the string indexer, keeping track of where each tag starts and ends
  2. Use Regular Expressions
  3. Load it into some XML/HTML DOM Parser and iterate through the nodes

Which one do you think will have the least performance impact? Any other way I can get the task done?

Do not re-invent the wheel, and especially do not use regular expressions.

Use an existing XML parser.
You should use LINQ to XML.
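
A minimal sketch of the LINQ to XML approach, under a few assumptions: the fragment has no single root element, so it is wrapped in a hypothetical `<root>` element before parsing; tags are not nested; and the class name `TaggedTextParser` and the (text, style) run shape are illustrative, not part of any library. Applying the runs to an Excel cell is left out.

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Linq;

public static class TaggedTextParser
{
    // Splits the markup into runs of (text, style), where style is the
    // tag name ("sub", "sup", ...) or "" for untagged text.
    public static List<(string Text, string Style)> Parse(string markup)
    {
        // Assumption: wrap the fragment so it becomes a well-formed document.
        XElement root = XElement.Parse(
            "<root>" + markup + "</root>", LoadOptions.PreserveWhitespace);

        var runs = new List<(string Text, string Style)>();
        foreach (XNode node in root.Nodes())
        {
            if (node is XText text)
            {
                // Plain text sitting directly between tags.
                runs.Add((text.Value, ""));
            }
            else if (node is XElement element)
            {
                // Text inside a tag; the tag name selects the formatting.
                runs.Add((element.Value, element.Name.LocalName));
            }
        }
        return runs;
    }
}
```

For the string in the question, `TaggedTextParser.Parse("<sub>A</sub>B<sup>C</sup>")` yields the runs ("A", "sub"), ("B", "") and ("C", "sup"), which the caller can then map to subscript, normal and superscript formatting.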

If you implement that and find it too slow, you can switch to an XmlReader, which will be extremely fast but annoying to work with.
Remember: premature optimization is the root of all evil.
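
If profiling does show the parser to be a bottleneck, the XmlReader variant could look roughly like the sketch below. It assumes `ConformanceLevel.Fragment` so the reader accepts text and several elements at the top level without a wrapper root; the class name `TaggedTextReader` and the (text, style) run shape are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class TaggedTextReader
{
    // Streams through the markup and returns runs of (text, style), where
    // style is the innermost enclosing tag name or "" for untagged text.
    public static List<(string Text, string Style)> Parse(string markup)
    {
        // Fragment mode allows multiple top-level nodes, including text.
        var settings = new XmlReaderSettings
        {
            ConformanceLevel = ConformanceLevel.Fragment
        };

        var runs = new List<(string Text, string Style)>();
        var openTags = new Stack<string>();

        using (XmlReader reader = XmlReader.Create(new StringReader(markup), settings))
        {
            while (reader.Read())
            {
                switch (reader.NodeType)
                {
                    case XmlNodeType.Element:
                        // Self-closing tags have no EndElement, so do not push them.
                        if (!reader.IsEmptyElement)
                            openTags.Push(reader.LocalName);
                        break;

                    case XmlNodeType.EndElement:
                        openTags.Pop();
                        break;

                    case XmlNodeType.Text:
                    case XmlNodeType.Whitespace:
                    case XmlNodeType.SignificantWhitespace:
                        runs.Add((reader.Value,
                                  openTags.Count > 0 ? openTags.Peek() : ""));
                        break;
                }
            }
        }
        return runs;
    }
}
```

The reader never builds a tree, which is where the speed comes from; the cost is that the caller has to track open tags by hand, as the stack above does.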
