HttpClient - Different content returned than browser

I'm trying to make a request to kicksusa.com. If I make the request from any browser, I get the full expected HTML. However, I cannot seem to simulate the request in a way that returns the same HTML; instead I get a 'Request unsuccessful.' message.

Any help is appreciated.

My code:

HttpClientHandler httpClientHandler = new HttpClientHandler()
{
    //Proxy = proxy,
    AllowAutoRedirect = true,
    MaxAutomaticRedirections = 15,
    AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate | DecompressionMethods.None
};

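// Pass the handler in; otherwise its redirect and decompression settings are never used.
var client = new HttpClient(httpClientHandler);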
client.DefaultRequestHeaders.Add("Host", "www.kicksusa.com");
client.DefaultRequestHeaders.Add("Connection", "keep-alive");
client.DefaultRequestHeaders.Add("Upgrade-Insecure-Requests", "1");
client.DefaultRequestHeaders.Add("User-Agent", "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.87 Safari/537.36");
client.DefaultRequestHeaders.Add("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8");
client.DefaultRequestHeaders.Add("Accept-Encoding", "gzip, deflate, sdch");
client.DefaultRequestHeaders.Add("Accept-Language", "en-GB,en-US;q=0.8,en;q=0.6");


var _response = await client.GetAsync("http://www.kicksusa.com/jordan-craig/oil-stain-slub-tee-army-green-8909ag.html");

if (_response.IsSuccessStatusCode)
{
    var _html = await _response.Content.ReadAsStringAsync();
}

Fiddler trace headers:

Host: www.kicksusa.com
Connection: keep-alive
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.87 Safari/537.36
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Encoding: gzip, deflate, sdch
Accept-Language: en-GB,en-US;q=0.8,en;q=0.6

This website uses dedicated technology from Incapsula to prevent automated access.

On the first request, the site returns a web document with an embedded iframe. Only when the iframe source is loaded is a cookie set, followed by a redirect to the requested page. All further requests then succeed immediately because the browser sends the cookie with them.
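To confirm that this is what the HttpClient request is receiving, the response body can be checked for signs of the interstitial page. The following is a minimal sketch, not a reliable detector: the class and method names are hypothetical, and the markers are assumptions based only on what this thread describes (the 'Request unsuccessful.' text and an embedded iframe), plus the guess that the page mentions Incapsula by name.

```csharp
using System;

public static class ChallengeDetector
{
    // Heuristic only. The markers below are assumptions drawn from the
    // behaviour described in this thread, not from Incapsula documentation.
    public static bool LooksLikeChallengePage(string html)
    {
        if (string.IsNullOrEmpty(html))
        {
            return false;
        }

        // The message the question reports receiving instead of the page.
        bool hasMessage = html.IndexOf("Request unsuccessful", StringComparison.OrdinalIgnoreCase) >= 0;

        // The embedded iframe described above; assumed to appear together
        // with the vendor name on the interstitial page.
        bool hasIframe = html.IndexOf("<iframe", StringComparison.OrdinalIgnoreCase) >= 0;
        bool mentionsVendor = html.IndexOf("Incapsula", StringComparison.OrdinalIgnoreCase) >= 0;

        return hasMessage || (hasIframe && mentionsVendor);
    }
}
```

Calling `ChallengeDetector.LooksLikeChallengePage(_html)` after `ReadAsStringAsync()` in the question's code would show whether the body is the interstitial rather than the product page. Note that `IsSuccessStatusCode` may not distinguish the two cases, which is why the body is inspected here.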

To circumvent the mechanism, you would have to load the iframe after the first request, remember the cookie, and then send it with all further requests. The first response also contains a lot of JavaScript, which would probably have to be executed for the Incapsula check to succeed.
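For reference, "remembering the cookie" is something HttpClient does on its own once its handler has a `CookieContainer`. The sketch below only illustrates that general mechanism; `CookieSessionSketch` and `Create` are hypothetical names. It does not execute any JavaScript, so by the reasoning above it should not be expected to pass the Incapsula check by itself.

```csharp
using System;
using System.Net;
using System.Net.Http;

public static class CookieSessionSketch
{
    // Builds a client whose handler stores cookies from responses and
    // resends them on later requests to matching URLs.
    public static (HttpClient Client, CookieContainer Cookies) Create()
    {
        var cookies = new CookieContainer();

        var handler = new HttpClientHandler
        {
            CookieContainer = cookies,
            UseCookies = true,
            AllowAutoRedirect = true,
            AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate
        };

        // The handler must be passed to the client for these settings to apply.
        return (new HttpClient(handler), cookies);
    }
}
```

Every request made through the returned client shares the same container, so a cookie set by one response is sent with the next request to the same site. This is the same behaviour a browser provides implicitly.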

However, when a site specifically uses such a technology to prevent automated access to its content, any attempt to circumvent the mechanism must be considered undesired and a criminal act. You should not try to gather data from a site automatically without its owner's approval, especially when a technology such as Incapsula is used to make this more difficult.

See also this answer by an Incapsula employee for more details.
