
Parsing web pages via C#, XmlDocument.LoadXml

Original 2011-12-16 18:57:30 3 1 c#/ parsing/ web/ xmldocument

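I am trying to download a web page and parse it. I need to reach every node of the HTML document. I download the page with WebClient, which works fine. Then I use the following snippet to parse the document: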

 WebClient client = new WebClient();

 Stream data = client.OpenRead("http://web.cs.hacettepe.edu.tr/~bil339/");
 StreamReader reader = new StreamReader(data);
 string xml = reader.ReadToEnd();

 data.Close();
 reader.Close();
 XmlDocument xmlDoc = new XmlDocument();
 xmlDoc.LoadXml(xml);   // the method is LoadXml (capital L); this is the line that fails

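On the last line the program waits for a while and then crashes. It says there are errors in the HTML code: something is unexpected, something should not be here, and so on. Are there any suggestions for fixing this? Other techniques for parsing HTML are welcome (in C#, of course).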
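To see exactly what the parser objects to, the exception can be caught and inspected instead of letting the program crash. The following is a minimal sketch, not the asker's code: the class and method names are hypothetical, and a short inline HTML string stands in for the downloaded page. The pause before the crash may come from the parser trying to fetch an external DTD named in the page's DOCTYPE; that is only a guess about this page, and the sketch sets `XmlResolver` to null to rule it out.

```csharp
using System;
using System.Xml;

public static class LoadXmlDiagnostics
{
    // Hypothetical helper: tries to load text as XML.
    // Returns true on success; otherwise fills 'error' with the parser's complaint.
    public static bool TryLoadXml(string text, out string error)
    {
        try
        {
            var doc = new XmlDocument();
            doc.XmlResolver = null;   // do not fetch external DTDs or entities
            doc.LoadXml(text);
            error = "";
            return true;
        }
        catch (XmlException ex)
        {
            // LineNumber and LinePosition point at the place where parsing stopped.
            error = $"line {ex.LineNumber}, position {ex.LinePosition}: {ex.Message}";
            return false;
        }
    }

    public static void Main()
    {
        // Ordinary HTML: <br> has no closing tag, which XML does not allow.
        string html = "<html><body><p>first<br>second</p></body></html>";

        if (TryLoadXml(html, out string error))
            Console.WriteLine("well-formed XML");
        else
            Console.WriteLine(error);
    }
}
```

Running this against the real page text in place of the inline string should show which construct in the page the XML parser rejects first.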

1 Answer

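Use HTML Agility Pack to parse HTML. Valid HTML is not necessarily well-formed XML, so an XML parser cannot be relied on to read it. For example, HTML allows elements such as <br> with no closing tag, unquoted attribute values, and named entities such as &nbsp; that XML does not predefine. An HTML page also usually lacks the <?xml version="1.0" encoding="UTF-8"?> declaration, although that declaration is optional in XML 1.0 and its absence alone would not cause the failure. HTML Agility Pack is more forgiving.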
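As an illustration of that suggestion, here is a minimal sketch that visits every element node with HTML Agility Pack. It assumes the third-party `HtmlAgilityPack` NuGet package is installed; the class and method names are hypothetical, and a short inline string stands in for the downloaded page.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;   // third-party NuGet package "HtmlAgilityPack"

public static class HtmlWalker
{
    // Hypothetical helper: parses HTML text and returns the name of every
    // element node in the order the library enumerates its descendants.
    public static List<string> ElementNames(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);   // tolerant parser: does not require well-formed XML

        var names = new List<string>();
        foreach (HtmlNode node in doc.DocumentNode.Descendants())
        {
            // Skip text and comment nodes; keep only elements.
            if (node.NodeType == HtmlNodeType.Element)
                names.Add(node.Name);
        }
        return names;
    }

    public static void Main()
    {
        // The same kind of markup that makes XmlDocument.LoadXml throw.
        string html = "<html><body><p>first<br>second</p></body></html>";

        foreach (string name in ElementNames(html))
            Console.WriteLine(name);
    }
}
```

The string passed to `LoadHtml` could equally be the text read with WebClient in the question. The library also provides an `HtmlWeb` class whose `Load(url)` method fetches and parses a page in one step.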

Related questions

C# XmlDocument.LoadXml And Wildcards
C# XmlDocument.LoadXml(string) fail - data at the root level is invalid. line 1 position 1.
Why does C# XmlDocument.LoadXml(string) fail when an XML header is included?
XmlDocument.Load Vs XmlDocument.LoadXml
XmlDocument.LoadXml() and XML declaration encoding attribute
Why does `XmlDocument.LoadXml()` not work with namespace?
Is there a Java equivalent for XmlDocument.LoadXml() from .NET?
XmlDocument.LoadXML Name cannot begin with the '<' character
XmlDocument.LoadXml() throws an exception of type ComException
Deserializing XML into JSON without using XmlDocument.Loadxml() function


