C# Trying to read a page using XmlNode
I am trying to read the Steam store page sorted from the lowest price to the highest. I have the URL I need, and I have written some code that worked in the past but no longer works. I have spent several days trying to fix this, but I can't find the problem.
Link I am trying to read.
Here is the code.
//List of items from the Steam market from lowest to highest
private void priceFromMarket(int StartPage)
{
    if (valueList.Count != 0)
    {
        valueList.Clear();
        numList.Clear();
        nameList.Clear();
    }
    string pageContent = null;
    string results_html = null;
    try
    {
        HttpWebRequest myReq = (HttpWebRequest)WebRequest.Create("http://steamcommunity.com/market/search/render/?query=appid:730&start=" + StartPage.ToString() + "&sort_column=price&sort_dir=asc&count=100&currency=1&l=english");
        HttpWebResponse myRes = (HttpWebResponse)myReq.GetResponse();
        using (StreamReader sr = new StreamReader(myRes.GetResponseStream()))
        {
            pageContent = sr.ReadToEnd();
        }
    }
    catch { Thread.Sleep(30000); priceFromMarket(StartPage); }
    if (pageContent == null) { priceFromMarket(StartPage); }
    try
    {
        JObject user = JObject.Parse(pageContent);
        bool success = (bool)user["success"];
        if (success)
        {
            results_html = (string)user["results_html"];
            string data = results_html;
            data = "<root>" + data + "</root>";
            XmlDocument document = new XmlDocument();
            document.LoadXml(System.Net.WebUtility.HtmlDecode(data));
            XmlNode rootnode = document.SelectSingleNode("root");
            XmlNodeList items = rootnode.SelectNodes("./a/div");
            foreach (XmlNode node in items)
            {
                //This does not work anymore!
                //The try fails here at line 574!
                string value = node.SelectSingleNode("./div[contains(concat(' ', @class, ' '), ' market_listing_their_price ')]/span/span").InnerText;
                string num = node.SelectSingleNode("./div[contains(concat(' ', @class, ' '), ' market_listing_num_listings ')]/span/span").InnerText;
                string name = node.SelectSingleNode("./div/span[contains(concat(' ', @class, ' '), ' market_listing_item_name ')]").InnerText;
                valueList.Add(value); //Lowest price for the item
                numList.Add(num); //Volume of that item
                nameList.Add(name); //Name of that item
            }
        }
        else { Thread.Sleep(60000); priceFromMarket(StartPage); }
    }
    catch { Thread.Sleep(60000); priceFromMarket(StartPage); }
}
Parsing HTML as XML is never reliable, because HTML does not have to be well-formed to be parsed properly, while an XML parser rejects any input that is not well-formed.
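To illustrate the point, here is a minimal sketch that uses only System.Xml. The class name XmlVsHtmlDemo, the helper LoadsAsXml, and the sample fragments are made up for this illustration; they are not taken from the Steam response. It shows two common ways HTML trips an XML parser: an unclosed tag, and a bare ampersand left behind after calling HtmlDecode on the markup before loading it.

```csharp
using System;
using System.Net;
using System.Xml;

public static class XmlVsHtmlDemo
{
    // Hypothetical helper: returns true when the fragment loads as XML,
    // false when XmlDocument rejects it with an XmlException.
    public static bool LoadsAsXml(string fragment)
    {
        try
        {
            var document = new XmlDocument();
            document.LoadXml("<root>" + fragment + "</root>");
            return true;
        }
        catch (XmlException)
        {
            return false;
        }
    }

    public static void Main()
    {
        // Well-formed markup is accepted.
        Console.WriteLine(LoadsAsXml("<a><div>ok</div></a>"));

        // An unclosed <br> is fine in HTML but is not well-formed XML.
        Console.WriteLine(LoadsAsXml("<div>line one<br>line two</div>"));

        // HtmlDecode turns &amp; into a bare '&', which XML does not allow.
        string decoded = WebUtility.HtmlDecode("<span>Tom &amp; Jerry</span>");
        Console.WriteLine(LoadsAsXml(decoded));
    }
}
```

If the markup returned by the server starts containing either pattern, code like the one in the question can start throwing even though nothing on the client side has changed.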
For parsing HTML in C#, I prefer to use CsQuery: https://www.nuget.org/packages/CsQuery/
It lets you query HTML in C# in much the same way as you would with jQuery.
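As a rough sketch of what that looks like, assuming the CsQuery NuGet package is referenced: the class CsQueryMarketSketch and the method ReadItemNames are hypothetical names, and the market_listing_item_name class is taken from the question, so it may no longer match the live markup.

```csharp
using System.Collections.Generic;
using CsQuery;

public static class CsQueryMarketSketch
{
    // Hypothetical helper: collects the text of every element that carries
    // the market_listing_item_name class (class name taken from the question).
    public static List<string> ReadItemNames(string resultsHtml)
    {
        var names = new List<string>();

        // CQ.Create parses the markup with an HTML parser, so unclosed
        // tags do not cause an exception the way they do with LoadXml.
        CQ dom = CQ.Create(resultsHtml);

        // The indexer takes a CSS selector, as $() does in jQuery.
        foreach (IDomObject element in dom[".market_listing_item_name"])
        {
            names.Add(element.Cq().Text().Trim());
        }
        return names;
    }
}
```

The price and listing-count values could be read the same way with their own selectors, for example ".market_listing_their_price", again assuming those class names are still present.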
Another option is HTML Agility Pack, which you could probably use without changing much of your code. Its functions are similar to those of the System.Xml.XmlDocument library.
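A minimal sketch of that approach, assuming the HtmlAgilityPack NuGet package is referenced. The class AgilityPackMarketSketch, the Listing type, and the helper TextOrNull are hypothetical names; the XPath expressions are copied from the question and may need adjusting if the markup has changed. Unlike the original loop, a missing node yields null instead of a NullReferenceException, which makes it easier to see which expression stopped matching.

```csharp
using System.Collections.Generic;
using HtmlAgilityPack;

public sealed class Listing
{
    public string Price;   // lowest price for the item
    public string Count;   // number of listings for the item
    public string Name;    // name of the item
}

public static class AgilityPackMarketSketch
{
    // XPath expressions taken from the question (assumed, not verified
    // against the current Steam markup).
    private const string PricePath =
        "./div[contains(concat(' ', @class, ' '), ' market_listing_their_price ')]/span/span";
    private const string CountPath =
        "./div[contains(concat(' ', @class, ' '), ' market_listing_num_listings ')]/span/span";
    private const string NamePath =
        "./div/span[contains(concat(' ', @class, ' '), ' market_listing_item_name ')]";

    public static List<Listing> ReadListings(string resultsHtml)
    {
        var listings = new List<Listing>();
        var document = new HtmlDocument();
        document.LoadHtml(resultsHtml); // tolerant of markup that is not well-formed

        // SelectNodes returns null, not an empty collection, when nothing matches.
        HtmlNodeCollection items = document.DocumentNode.SelectNodes("//a/div");
        if (items == null)
        {
            return listings;
        }

        foreach (HtmlNode item in items)
        {
            listings.Add(new Listing
            {
                Price = TextOrNull(item, PricePath),
                Count = TextOrNull(item, CountPath),
                Name = TextOrNull(item, NamePath)
            });
        }
        return listings;
    }

    // Returns the decoded, trimmed text of the first match, or null if none.
    private static string TextOrNull(HtmlNode node, string xpath)
    {
        HtmlNode match = node.SelectSingleNode(xpath);
        return match == null ? null : HtmlEntity.DeEntitize(match.InnerText).Trim();
    }
}
```

With this in place, the body of the question's success branch could pass results_html straight to ReadListings, without wrapping it in a root element or calling HtmlDecode first.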