Get HTML tag from a string
What is the best and cleanest way of getting an HTML tag from a string?
I have a string of HTML with several embed tags containing videos. There can be any number of embed tags in the HTML string.
I guess I could do something like this, but it can't be the best way:
string embedSrc = propertyText.Substring(propertyText.IndexOf("<embed"), (propertyText.IndexOf ("</embed") - propertyText.IndexOf("<embed") + 8));
Try using the HtmlAgilityPack to parse it easily. If that is not an option, you could use a regular expression.
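For illustration, here is a minimal sketch of how the HtmlAgilityPack suggestion might look for the embed case. It assumes the HtmlAgilityPack NuGet package is referenced; the class name, the FindEmbedSources helper, and the sample markup are made up for this example. Note that SelectNodes returns null when nothing matches, so that case is handled explicitly.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

public static class EmbedAgilityDemo
{
    // Hypothetical helper: returns the src attribute of every <embed> element.
    public static List<string> FindEmbedSources(string html)
    {
        var sources = new List<string>();

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when no node matches.
        HtmlNodeCollection embeds = doc.DocumentNode.SelectNodes("//embed");
        if (embeds == null)
        {
            return sources;
        }

        foreach (HtmlNode embed in embeds)
        {
            // The second argument is the default used when the attribute is missing.
            sources.Add(embed.GetAttributeValue("src", string.Empty));
        }
        return sources;
    }

    public static void Main()
    {
        // Sample input, invented for this sketch.
        string propertyText =
            "<p>Intro</p><embed src=\"a.swf\"></embed><p>Mid</p><embed src=\"b.swf\"></embed>";

        foreach (string src in FindEmbedSources(propertyText))
        {
            Console.WriteLine(src);
        }
    }
}
```

If the full tag text is needed rather than just the src attribute, each node also exposes OuterHtml, although the serialized markup may not match the input character for character.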
I think you can use the C# API for this. Try using XmlDocument's LoadXml(string) method. After that, just use the object operations to extract inner tags or text from it. Take a look at the XmlDocument documentation on MSDN.
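A rough sketch of the XmlDocument approach follows. It only works when the string is well-formed XML with a single root element; typical HTML (unclosed tags, entities such as &nbsp;) makes LoadXml throw an XmlException, which is why the call is wrapped in a try/catch here. The class name, the FindEmbedSources helper, and the sample markup are assumptions for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class EmbedXmlDemo
{
    // Hypothetical helper: returns the src attribute of every <embed> element,
    // or an empty list when the input is not well-formed XML.
    public static List<string> FindEmbedSources(string xml)
    {
        var sources = new List<string>();
        var doc = new XmlDocument();

        try
        {
            doc.LoadXml(xml);
        }
        catch (XmlException)
        {
            // Real-world HTML is often not valid XML; give up quietly in this sketch.
            return sources;
        }

        // XPath matching is case-sensitive, so this finds <embed> but not <EMBED>.
        foreach (XmlElement embed in doc.SelectNodes("//embed"))
        {
            // GetAttribute returns an empty string when the attribute is absent.
            sources.Add(embed.GetAttribute("src"));
        }
        return sources;
    }

    public static void Main()
    {
        // Sample input, invented for this sketch; note the single <div> root.
        string propertyText =
            "<div><embed src=\"a.swf\"></embed><p>text</p><embed src=\"b.swf\"></embed></div>";

        foreach (string src in FindEmbedSources(propertyText))
        {
            Console.WriteLine(src);
        }
    }
}
```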
Sebastian has the right of it: find a library, and the HtmlAgilityPack is a great option. If you need the document structure, this is really the best option.
Parsing with regular expressions is generally considered a no-no for HTML. It really depends on what you're trying to read out of the input string. I wrote a lightweight xml/html parser using Regex just to see it done. This can provide you with the Regex patterns needed.
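As a hedged illustration of the regex route (this is not the pattern from the parser mentioned above, just a sketch), the following collects every embed tag in a string. It assumes each embed has an explicit closing tag, as in the question's Substring attempt, and that embed tags are not nested. The class name, the FindEmbedTags helper, and the sample markup are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class EmbedRegexDemo
{
    // Matches "<embed ...>" up to the nearest "</embed>".
    // Singleline lets "." span line breaks; the lazy ".*?" stops at the first closing tag.
    private static readonly Regex EmbedTag = new Regex(
        @"<embed\b[^>]*>.*?</embed\s*>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Hypothetical helper: returns the full text of every matched embed tag.
    public static List<string> FindEmbedTags(string html)
    {
        var tags = new List<string>();
        foreach (Match match in EmbedTag.Matches(html))
        {
            tags.Add(match.Value);
        }
        return tags;
    }

    public static void Main()
    {
        // Sample input, invented for this sketch.
        string propertyText =
            "<p>Intro</p><embed src=\"a.swf\"></embed><p>Mid</p><embed src=\"b.swf\" width=\"100\"></embed>";

        foreach (string tag in FindEmbedTags(propertyText))
        {
            Console.WriteLine(tag);
        }
    }
}
```

A pattern like this breaks easily on unusual markup (for example a ">" inside an attribute value, or an embed without a closing tag), which is the usual reason a real parser is preferred.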