C# trying to read a page using XmlNode

Question

I am trying to read the Steam store page sorted from the lowest price to the highest. I have the URL I need, and I wrote some code that worked in the past but no longer works. I have spent several days trying to fix this, but I can't find the problem.

Link I am trying to read.

Here is the code.

    //List of items from the Steam market from lowest to highest
    private void priceFromMarket(int StartPage)
    {
        if (valueList.Count != 0)
        {
            valueList.Clear();
            numList.Clear();
            nameList.Clear();
        }
        string pageContent = null;
        string results_html = null;
        try
        {
            HttpWebRequest myReq = (HttpWebRequest)WebRequest.Create("http://steamcommunity.com/market/search/render/?query=appid:730&start=" + StartPage.ToString() + "&sort_column=price&sort_dir=asc&count=100&currency=1&l=english");
            HttpWebResponse myRes = (HttpWebResponse)myReq.GetResponse();
            using (StreamReader sr = new StreamReader(myRes.GetResponseStream()))
            {
                pageContent = sr.ReadToEnd();
            }
        }
        catch { Thread.Sleep(30000); priceFromMarket(StartPage); }
        if (pageContent == null) { priceFromMarket(StartPage); }
        try
        {
            JObject user = JObject.Parse(pageContent);
            bool success = (bool)user["success"];
            if (success)
            {
                results_html = (string)user["results_html"];
                string data = results_html;
                data = "<root>" + data + "</root>";
                XmlDocument document = new XmlDocument();
                document.LoadXml(System.Net.WebUtility.HtmlDecode(data));
                XmlNode rootnode = document.SelectSingleNode("root");
                XmlNodeList items = rootnode.SelectNodes("./a/div");
                foreach (XmlNode node in items)
                {
                    //This does not work anymore!
                    //The try fails here at line 574!
                    string value = node.SelectSingleNode("./div[contains(concat(' ', @class, ' '), ' market_listing_their_price ')]/span/span").InnerText;
                    string num = node.SelectSingleNode("./div[contains(concat(' ', @class, ' '), ' market_listing_num_listings ')]/span/span").InnerText;
                    string name = node.SelectSingleNode("./div/span[contains(concat(' ', @class, ' '), ' market_listing_item_name ')]").InnerText;
                    valueList.Add(value); //Lowest price for the item
                    numList.Add(num); //Volume of that item
                    nameList.Add(name); //Name of that item
                }
            }
            else { Thread.Sleep(60000); priceFromMarket(StartPage); }
        }
        catch { Thread.Sleep(60000); priceFromMarket(StartPage); }
    }

Answer 1

It's never reliable to parse HTML as XML, because HTML doesn't have to be well-formed to be parsed properly by an HTML parser, while an XML parser rejects anything that isn't well-formed.
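To make that failure mode concrete, here is a minimal sketch that uses only `System.Xml`. The class name, the `TryLoadAsXml` helper, and the sample fragments are made up for illustration; they are not taken from the actual Steam response.

```csharp
using System;
using System.Xml;

static class XmlVsHtmlDemo
{
    // Hypothetical helper: returns true if XmlDocument accepts the markup,
    // false if LoadXml throws an XmlException.
    public static bool TryLoadAsXml(string markup)
    {
        try
        {
            var document = new XmlDocument();
            document.LoadXml("<root>" + markup + "</root>");
            return true;
        }
        catch (XmlException)
        {
            return false;
        }
    }

    static void Main()
    {
        // Well-formed: every tag is closed, so the XML parser accepts it.
        Console.WriteLine(TryLoadAsXml("<div><span>$0.03</span></div>"));   // True

        // Perfectly valid HTML, but <br> is never closed, so it is not XML.
        Console.WriteLine(TryLoadAsXml("<div>line one<br>line two</div>")); // False
    }
}
```

If the page you are scraping starts emitting a single unclosed tag or a bare `&`, code like this goes from working to throwing without any change on your side.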

For parsing HTML in C#, I prefer to use CsQuery: https://www.nuget.org/packages/CsQuery/

It lets you query HTML in C# with CSS selectors, much as you would with jQuery.
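As a rough sketch of what that looks like, assuming the CsQuery NuGet package is referenced: the class name, the `ExtractNames` helper, and the sample fragment below are invented for illustration, and only the `market_listing_item_name` class name comes from the question's code.

```csharp
using System;
using System.Collections.Generic;
using CsQuery; // NuGet package "CsQuery"

static class CsQueryDemo
{
    // Hypothetical helper: collects the text of every element that carries
    // the market_listing_item_name class.
    public static List<string> ExtractNames(string html)
    {
        var names = new List<string>();
        CQ dom = CQ.Create(html);

        // The indexer takes a CSS selector, like $(".some-class") in jQuery.
        foreach (IDomObject element in dom[".market_listing_item_name"])
        {
            names.Add(element.Cq().Text().Trim());
        }
        return names;
    }

    static void Main()
    {
        // Made-up fragment; note the unclosed <br>, which an XML parser would reject.
        string sample =
            "<a><div><span class=\"market_listing_item_name\">Example Item</span><br></div></a>";

        foreach (string name in ExtractNames(sample))
        {
            Console.WriteLine(name);
        }
    }
}
```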

Another option is HTML Agility Pack, which you could probably use without changing much of your code. Its API is similar to that of the System.Xml.XmlDocument library.
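Here is a sketch of how the loop from the question might be ported, assuming the HtmlAgilityPack NuGet package is referenced. The class name, the `ExtractNames` helper, and the sample fragment are hypothetical; the XPath mirrors the class test used in the question, but the real structure of `results_html` may differ, so treat the expression as a starting point.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // NuGet package "HtmlAgilityPack"

static class MarketHtmlParser
{
    // Hypothetical helper: extracts item names from a results_html-style fragment.
    public static List<string> ExtractNames(string resultsHtml)
    {
        var names = new List<string>();

        var document = new HtmlDocument();
        document.LoadHtml(resultsHtml); // tolerates unclosed tags; no <root> wrapper needed

        // Unlike XmlNode.SelectNodes, this returns null (not an empty list)
        // when nothing matches, so check before iterating.
        HtmlNodeCollection spans = document.DocumentNode.SelectNodes(
            "//span[contains(concat(' ', normalize-space(@class), ' '), ' market_listing_item_name ')]");
        if (spans == null)
        {
            return names;
        }

        foreach (HtmlNode span in spans)
        {
            // InnerText keeps entities such as &amp; encoded; DeEntitize decodes them.
            names.Add(HtmlEntity.DeEntitize(span.InnerText).Trim());
        }
        return names;
    }

    static void Main()
    {
        // Made-up fragment; the unclosed <br> would make XmlDocument.LoadXml throw.
        string sample =
            "<a><div><span class=\"market_listing_item_name\">Example Item</span><br></div></a>";

        foreach (string name in ExtractNames(sample))
        {
            Console.WriteLine(name);
        }
    }
}
```

Checking the result of `SelectNodes` and `SelectSingleNode` for null before reading `InnerText` also turns a layout change on the page into an empty result rather than a `NullReferenceException` swallowed by a bare `catch`.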

Answer 1
3 2015-11-08 02:33:34
