How do you serialize HTML in C#?
I think I know how to use xsd.exe to create C# classes from an XML document; those classes can then be used with the XmlSerializer class to serialize and verify the document.
Is there a way to do the same sort of thing with an HTML document? I have tried, but the xsd command line reports that the remote name www.w3.org cannot be resolved.
At a minimum, is there a way to use C# to find out whether an HTML file is valid?
The HTML Agility Pack (HtmlAgilityPack) is an open source library that parses HTML for you. You can then search and manipulate the structure of the document quite easily.
It is quite forgiving of the HTML you give it, so I'm not sure it is a good way to check whether you have a strictly valid XHTML document. But it should be able to parse anything a modern browser can.
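As a rough illustration of that kind of parsing and searching, here is a minimal sketch. It assumes the HtmlAgilityPack NuGet package is referenced; the class name LinkLister and the sample markup are hypothetical and only serve the example. Because the parser is lenient, the ParseErrors collection is a hint about malformed markup, not a verdict on validity against any DTD.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // third-party package, assumed to be installed from NuGet

public static class LinkLister
{
    // Parses an HTML string and returns the href value of every <a> element that has one.
    public static List<string> GetLinks(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // The parser recovers from bad markup; problems it noticed are listed here.
        foreach (HtmlParseError error in doc.ParseErrors)
        {
            Console.WriteLine("Line {0}: {1}", error.Line, error.Reason);
        }

        var links = new List<string>();
        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection anchors = doc.DocumentNode.SelectNodes("//a[@href]");
        if (anchors != null)
        {
            foreach (HtmlNode anchor in anchors)
            {
                links.Add(anchor.GetAttributeValue("href", string.Empty));
            }
        }
        return links;
    }
}
```

Calling LinkLister.GetLinks on a page's markup gives you the link targets to inspect or rewrite; the same XPath approach works for any other element you want to find.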
If it's XHTML that you're trying to validate, you can do it like this:
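using System;
using System.Xml;
using System.Xml.Schema;

class XhtmlValidator
{
    static void validate(string filename)
    {
        XmlReaderSettings settings = new XmlReaderSettings();
        // Allow the DOCTYPE to be processed. On .NET versions before 4.0,
        // use settings.ProhibitDtd = false instead.
        settings.DtdProcessing = DtdProcessing.Parse;
        settings.ValidationType = ValidationType.DTD;
        settings.ValidationEventHandler +=
            new ValidationEventHandler(ValidationCallBack);
        // XhtmlUrlResolver is not a .NET class; it is assumed to be a custom
        // XmlResolver (for example an XmlUrlResolver subclass) defined elsewhere.
        settings.XmlResolver = new XhtmlUrlResolver();

        // Create the XmlReader object and dispose of it when done.
        using (XmlReader reader = XmlReader.Create(filename, settings))
        {
            // Parse the file; validation errors are reported through the callback.
            while (reader.Read()) { }
        }
    }

    // Display any validation errors.
    private static void ValidationCallBack(object sender, ValidationEventArgs e)
    {
        Console.WriteLine("Validation Error: {0}", e.Message);
    }
}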
It will be a bit slow because it downloads the DTD files from the W3C web site.
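One way to avoid the download, sketched here as an alternative and not as the resolver used in the answer above, is the XmlPreloadedResolver class in System.Xml.Resolvers (available since .NET 4.0), which serves the XHTML 1.0 DTDs from copies shipped with the framework. The class name OfflineXhtmlValidator and the choice to collect messages in a list are assumptions made for this example. It only covers documents whose DOCTYPE points at one of the XHTML 1.0 DTDs.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Resolvers;
using System.Xml.Schema;

public static class OfflineXhtmlValidator
{
    // Validates an XHTML 1.0 document against the DTD named in its DOCTYPE.
    // Returns the validation messages; an empty list means no DTD violations.
    // Markup that is not well-formed XML throws XmlException instead.
    public static List<string> Validate(string xhtml)
    {
        var errors = new List<string>();
        var settings = new XmlReaderSettings
        {
            DtdProcessing = DtdProcessing.Parse,
            ValidationType = ValidationType.DTD,
            // Resolves the XHTML 1.0 DTDs and entity files from built-in copies,
            // so nothing is fetched from www.w3.org.
            XmlResolver = new XmlPreloadedResolver(XmlKnownDtds.Xhtml10)
        };
        settings.ValidationEventHandler +=
            (sender, e) => errors.Add(e.Message);

        using (XmlReader reader = XmlReader.Create(new StringReader(xhtml), settings))
        {
            // Reading to the end drives validation of the whole document.
            while (reader.Read()) { }
        }
        return errors;
    }
}
```

Returning the messages instead of printing them lets the caller decide whether any error makes the file invalid. To validate a file, pass File.ReadAllText(filename) or adapt the method to call XmlReader.Create with the file path.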