
How to get HTML content from Amazon using HttpWebRequest

I am trying to get HTML content from the Amazon website. Here is my code to create the request, get the response, and read it as a string:

    public static HttpWebResponse GetHttpWebResponse(string url)
    {
        HttpWebRequest webRequest = (HttpWebRequest)WebRequest.Create(url);
        webRequest.ContentType = "text/xml";
        try
        {
            return (HttpWebResponse)webRequest.GetResponse();
        }
        catch (WebException e)
        {
            if (e.Response == null)
                throw new Exception("Cannot get response");
            return (HttpWebResponse)e.Response;
        }
    }

    public static string GetString(HttpWebResponse response)
    {
        Encoding encoding = Encoding.UTF8;
        using (var reader = new StreamReader(response.GetResponseStream(), encoding))
        {
            string responseText = reader.ReadToEnd();
            return responseText;
        }
    }

It works fine with other websites. However, when I try to get content from Amazon, for example https://www.amazon.com/gp/product/B00AEISSHA/ref=ppx_yo_dt_b_asin_title_o00_s00?ie=UTF8&psc=1, I see encoded content:

[Screenshot: encoded content]

I tried changing the Encoding and also used HttpUtility.HtmlDecode(html);, but neither helped. Is there a simple way to get content from Amazon?

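You're not handling compression. The response body is most likely gzip-compressed, so reading the raw bytes as UTF-8 text produces what looks like encoded garbage. If you enable automatic decompression on your web request like this, it should do the trick: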

public static HttpWebResponse GetHttpWebResponse(string url)
{
    HttpWebRequest webRequest = (HttpWebRequest)WebRequest.Create(url);
    webRequest.ContentType = "text/xml";
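    // Let the framework decompress the body; Deflate is included here in case the server uses it instead of gzip.
    webRequest.AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate;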
    try
    {
        return (HttpWebResponse)webRequest.GetResponse();
    }
    catch (WebException e)
    {
        if (e.Response == null)
            throw new Exception("Cannot get response");
        return (HttpWebResponse)e.Response;
    }
}
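
To see why the uncompressed read looks like garbage, here is a minimal offline sketch that needs no network access. It is only an illustration: the class GzipDemo and its Compress and Decompress helpers are hypothetical names, not part of the code above, and it assumes the server used gzip. It compresses a small HTML string the way a server might, then restores it with GZipStream, which is roughly the work AutomaticDecompression saves you from doing by hand.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

// Hypothetical helper class, for illustration only.
public static class GzipDemo
{
    // Compress text roughly as a server would before sending
    // a response marked "Content-Encoding: gzip".
    public static byte[] Compress(string text)
    {
        byte[] raw = Encoding.UTF8.GetBytes(text);
        using (var buffer = new MemoryStream())
        {
            using (var gzip = new GZipStream(buffer, CompressionMode.Compress))
            {
                gzip.Write(raw, 0, raw.Length);
            }
            // ToArray still works after the GZipStream has closed the buffer.
            return buffer.ToArray();
        }
    }

    // Decompress gzip bytes and decode the result as UTF-8 text.
    public static string Decompress(byte[] compressed)
    {
        using (var input = new MemoryStream(compressed))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var reader = new StreamReader(gzip, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Decoding the output of Compress directly with Encoding.UTF8.GetString gives unreadable text, much like in the question, while passing it through Decompress returns the original HTML. A gzip body always starts with the bytes 0x1F 0x8B, which is a quick way to check whether a response you are holding is still compressed.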
