
Extract specific data from an HTML CDATA pattern in C#

I have a problem parsing data from an XML feed. The description node contains this line:

<![CDATA[<div><b>ID:</b> 40</div><div><b>Name:</b> John</div>]]>

How can I parse the ID and Name along with their values?

You can do this using HtmlAgilityPack and Regex as follows:

  using System.Text.RegularExpressions;
  using HtmlAgilityPack;

  string a = "<![CDATA[<div><b>ID:</b> 40</div><div><b>Name:</b> John</div>]]>";

  // Strip the <![CDATA[ ... ]]> wrapper to get the inner HTML.
  string html = Regex.Match(a, @"<!\[CDATA\[(.*?)\]\]>", RegexOptions.Singleline).Groups[1].Value;

  var doc = new HtmlDocument();
  doc.LoadHtml(html);

  // Each <div> holds one "Label: value" pair.
  var divs = doc.DocumentNode.SelectNodes("//div");
  string ID = divs[0].InnerText.Split(':')[1].Trim();
  string Name = divs[1].InnerText.Split(':')[1].Trim();

This works for me with your example data.
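
Since the data comes from an XML feed, the CDATA wrapper may not need a regex at all: an XML parser removes it when you read the node's value. Below is a minimal sketch of that idea using only System.Xml.Linq. The surrounding `<item>` element, the `FeedParser` class, and the `ParseDescription` method are assumptions made for illustration; only the CDATA content is taken from the question. It also assumes the HTML fragment is well-formed, which holds for this sample but not for HTML in general.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

static class FeedParser
{
    // Hypothetical helper: reads the <description> child of the root element
    // and returns its "Label: value" pairs as a dictionary.
    public static Dictionary<string, string> ParseDescription(string xml)
    {
        // XElement.Value returns the CDATA content with the wrapper already removed.
        string html = XDocument.Parse(xml).Root.Element("description").Value;

        // The fragment has no single root, so wrap it in one before parsing.
        // This only works because the sample fragment is well-formed XML.
        return XElement.Parse("<root>" + html + "</root>")
            .Elements("div")
            .ToDictionary(
                d => d.Element("b").Value.TrimEnd(':'),                // "ID:" -> "ID"
                d => d.Nodes().OfType<XText>().Last().Value.Trim());   // " 40" -> "40"
    }
}

class Program
{
    static void Main()
    {
        // Assumed feed shape; the real feed will likely have more elements.
        string xml =
            "<item><description><![CDATA[<div><b>ID:</b> 40</div>" +
            "<div><b>Name:</b> John</div>]]></description></item>";

        var fields = FeedParser.ParseDescription(xml);
        Console.WriteLine(fields["ID"]);
        Console.WriteLine(fields["Name"]);
    }
}
```

If the description can contain messy real-world HTML (unclosed tags, entities such as `&nbsp;`), keep the XML step for unwrapping the CDATA and hand the resulting string to HtmlAgilityPack instead of `XElement.Parse`.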
