Delphi Indy TIdHTTP: website recognizes robots

I'm trying to send a GET request to a website. The problem is that the website recognizes that the requester is a robot.

const _URL = 'https://www.URL.com/';
var
  sSessionID:String;
  Params: TStringList;
  IdSSL: TIdSSLIOHandlerSocketOpenSSL;
begin
  IdSSL := TIdSSLIOHandlerSocketOpenSSL.Create(IdHTTP1);
  try
    IdHTTP1.IOHandler := IdSSL;
    IdHTTP1.AllowCookies := True;
    IdHTTP1.HandleRedirects := True;
    IdHTTP1.Request.UserAgent := 'Mozilla/5.0 (Windows NT 10.0; WOW64; rv:42.0) Gecko/20100101 Firefox/42.0';
    IdHTTP1.Request.Accept := 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8';
    IdHTTP1.Request.AcceptLanguage := 'en-GB,en;q=0.5';
    IdHTTP1.Request.Connection := 'keep-alive';
    IdHTTP1.Request.ContentType := 'application/x-www-form-urlencoded';
    sSessionID := IdHTTP1.Get(_URL);
    { ...
      extracting SessionID
      Params.Add('SessionID=' + sSessionID);
      IdHTTP1.Post(_URL, Params);
      ... }
  finally
    IdSSL.Free;
  end;
end;

The result of IdHTTP1.Get is <!DOCTYPE html><head><META NAME="ROBOTS"..... The page is otherwise empty, so I can't obtain the session ID.

The HTTP request headers are the same as the ones my browser sends.
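One way to check that claim is to look at what Indy actually built for the request and what the server sent back, rather than relying on the properties that were assigned. The unit below is only a sketch: the unit name HttpDiagnostics and the functions SentHeaders and ReceivedHeaders are hypothetical helpers, not part of Indy or of the code above. It assumes Indy 10, where TIdHTTP exposes Request.RawHeaders, Response.RawHeaders and ResponseText.

```pascal
unit HttpDiagnostics; // hypothetical unit name, used for illustration only

interface

uses
  IdHTTP;

// Headers Indy built for the most recent request (the request line itself
// is not part of this list).
function SentHeaders(AHTTP: TIdHTTP): string;

// Status line and headers of the most recent response.
function ReceivedHeaders(AHTTP: TIdHTTP): string;

implementation

function SentHeaders(AHTTP: TIdHTTP): string;
begin
  // RawHeaders is a string list with one "Name: Value" entry per header.
  Result := AHTTP.Request.RawHeaders.Text;
end;

function ReceivedHeaders(AHTTP: TIdHTTP): string;
begin
  // ResponseText holds the status line returned by the server.
  Result := AHTTP.ResponseText + sLineBreak + AHTTP.Response.RawHeaders.Text;
end;

end.
```

Calling SentHeaders(IdHTTP1) right after IdHTTP1.Get and comparing the result line by line with the browser's developer tools should show any header that Indy adds, omits, or orders differently. With HandleRedirects enabled, the values should reflect the last request in the redirect chain.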

Without the real URL to test against, this is my best guess:

uses
  Math;
...
const
  _URL = 'https://www.url.com/';
var
  sSessionID: string;
  Params: TStringList;
  IdSSL: TIdSSLIOHandlerSocketOpenSSL;
begin
  // Call Randomize once at program startup; without it, Random and
  // RandomRange return the same sequence on every run.
  IdSSL := TIdSSLIOHandlerSocketOpenSSL.Create(IdHTTP1);
  try
    IdHTTP1.IOHandler := IdSSL;
    IdHTTP1.AllowCookies := True;
    IdHTTP1.HandleRedirects := True;
    // Send a random client address; Random(255) yields 0..254 for each octet.
    IdHTTP1.Request.CustomHeaders.AddValue('X-Forwarded-For', Format('%d.%d.%d.%d', [Random(255), Random(255), Random(255), Random(255)]));
    // Vary the User-Agent on each request; RandomRange(3, 5) yields 3 or 4.
    IdHTTP1.Request.UserAgent := Format('Mozilla/%d.0 (Windows NT %d.%d; rv:2.0.1) Gecko/20100101 Firefox/%d.%d.%d', [RandomRange(3, 5), RandomRange(3, 5), Random(2), RandomRange(3, 5), Random(5), Random(5)]);
    IdHTTP1.Request.Accept := 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8';
    IdHTTP1.Request.AcceptLanguage := 'en-GB,en;q=0.5';
    IdHTTP1.Request.Connection := 'keep-alive';
    IdHTTP1.Request.ContentType := 'application/x-www-form-urlencoded';
    sSessionID := IdHTTP1.Get(_URL);
...
  finally
    ...
  end;

