I have an application that collects information from various sites to fill my database. I'm stuck on one site that has a captcha, which I need to show to my users. The problem is that the image is set as the background-image of a DIV. The DIV's id is captchaCodigo.
I know how to get elements by id and name, work with their values, and so on, but I don't know how to obtain this picture or its URL.
Thanks in advance.
The captcha is very specific: if you request it through its web link, its text changes.
I solved that mystery a few days ago; the only way I found is to take a screenshot of the element.
Here are some lines of my code (it's not pretty and I am still fine-tuning it, but it works perfectly):
The main procedure, which saves the bitmap to a TImage:
procedure TForm1.elscreenshot(var elid: string; imid: Integer); // element ID and image ID
var
  doc: IHTMLDocument2;
  imgRange: IHTMLControlRange;
  img: IHTMLImgElement;
  render: IHTMLElementRender;
  bmp: TBitmap;
  _hdc: HDC;
  img_NameProp: string;
  img_idx, ii: Integer;
begin
  doc := embeddedwb1.Document as IHTMLDocument2;
  imgRange := (doc.body as HTMLBody).createControlRange as IHTMLControlRange;
  for ii := imid to imid do
  begin
    img_idx := ii;
    // walk the document's images, starting at imid, until the captcha image is found
    repeat
      img := doc.images.item(img_idx, EmptyParam) as IHTMLImgElement;
      Application.ProcessMessages;
      Inc(img_idx);
    until Pos('andom', img.href) > 0; // a captcha usually has the word "random" in its href; in your case its name is uuidCaptcha
    img_NameProp := Utf8ToAnsi(UTF8Encode(img.nameProp));
    begin
      render := (img as IHTMLElementRender);
      bmp := TBitmap.Create;
      try
        bmp.Width := img.Width;
        bmp.Height := img.Height;
        _hdc := bmp.Canvas.Handle;
        render.DrawToDC(_hdc); // renders the image element into the bitmap
        Image1.Picture.Assign(bmp); // <- the screenshot ends up here
        cxtextedit1.SetFocus; // focus my edit box for user interaction
      finally
        bmp.Free;
      end;
      break;
    end;
  end;
end;
The other code I need for that (to get the element ID and image ID):
procedure TForm1.getcaptcha(var i: Integer);

  // nested function: returns the last element whose class name matches, or nil
  function GetElementsByClassName(ADoc: IDispatch; const strClassName: string): IHTMLElement;
  var
    vDocument: IHTMLDocument2;
    vElementsAll: IHTMLElementCollection;
    vElement: IHTMLElement;
    I, ElementCount: Integer;
  begin
    Result := nil;
    ElementCount := 0;
    if not Supports(ADoc, IHTMLDocument2, vDocument) then
      raise Exception.Create('Invalid HTML document');
    vElementsAll := vDocument.all;
    for I := 0 to vElementsAll.length - 1 do
      if Supports(vElementsAll.item(I, EmptyParam), IHTMLElement, vElement) then
        if SameText(vElement.className, strClassName) then
        begin
          Result := vElement;
        end;
  end;

// the procedure itself:
var
  x: Integer;
  Doc3: IHTMLDocument3;
  cpt: IHTMLElement;
  el: string;
begin
  Doc3 := EmbeddedWB1.Document as IHTMLDocument3;
  cpt := nil;
  x := 0; // index of the first image to inspect; elscreenshot searches forward from here
  repeat
    Application.ProcessMessages;
  until EmbeddedWB1.ReadyState > 2;
  repeat
    Application.ProcessMessages;
    cpt := GetElementsByClassName(Doc3, 'input-captcha input-text'); // here you must find out what your element's class name is (use getinnertext and parse, or something similar)
  until cpt <> nil; // this repeat..until waits for the captcha image to load; it usually takes longer than the rest of the page (on slower connections)
  if Assigned(cpt) then
  begin
    el := cpt.id;
    elscreenshot(el, x);
  end;
end;
Hope it helps :)
You can't obtain a URL for this picture because it is not served as a separate image file with its own address.
In scenarios like this, pictures are usually transferred as data streams and then rendered on the client side.
In your case the image is actually transferred from the server to your computer as a base64 string, which is then decoded into the actual image on your computer.
You can see this for yourself with a little clever use of the Google Chrome developer tools, which can be opened with the F12 key.
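If the DIV's style is readable from the embedded browser, the background-image value would presumably look like url("data:image/png;base64,..."). Here is a minimal sketch, assuming that format, of pulling the base64 payload out of such a value. ExtractBase64FromCssUrl is a hypothetical helper name and the sample string is made up.

```pascal
program DataUriDemo;
{$APPTYPE CONSOLE}
uses
  System.SysUtils;

// Hypothetical helper: returns the base64 payload of a CSS value such as
// url("data:image/png;base64,AAAA"), or '' when no base64 marker is found.
function ExtractBase64FromCssUrl(const CssValue: string): string;
const
  Marker = 'base64,';
var
  StartPos, EndPos: Integer;
begin
  Result := '';
  StartPos := Pos(Marker, CssValue);
  if StartPos = 0 then
    Exit;
  Inc(StartPos, Length(Marker));
  // The payload ends at the first quote or closing parenthesis (or at the end of the string).
  EndPos := StartPos;
  while (EndPos <= Length(CssValue)) and
        not CharInSet(CssValue[EndPos], ['"', '''', ')']) do
    Inc(EndPos);
  Result := Trim(Copy(CssValue, StartPos, EndPos - StartPos));
end;

begin
  // Made-up sample value, for illustration only.
  Writeln(ExtractBase64FromCssUrl('url("data:image/png;base64,iVBORw0KGgo=")'));
end.
```

The input could come from something like Doc3.getElementById('captchaCodigo').style.backgroundImage through MSHTML, though whether the control hands back the data URI unchanged is an assumption. The returned string can then be passed to a base64 decoder and loaded as a PNG.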
The main reason for implementing it this way is to prevent web bots from getting past the captcha security.
In fact, most sites that use a captcha protection system use it for a reason.
The most common reason is to prevent web bots from causing havoc on the servers by posting spam on the site.
Another reason is to prevent them from overloading the servers by downloading all of the site's content.
In fact, gathering information from that site and storing it in your own database might be a direct violation of the site's usage agreement.
And given that the site in question is connected to the judicial system, I'm willing to bet that information posted on it should not be copied or redistributed in any way.
The image is base64 encoded. You need to grab it from the HTML code and convert it to a bitmap. Do NOT forget to send the uuidCaptcha with your POST request; that is the ID that identifies the captcha you entered in your program.
uses Soap.EncdDecd, IdHTTP, System.StrUtils, pngimage;

// Returns the text between FirstTag and LastTag in s (note the argument order: LastTag comes first).
function _ExtractBetweenTags(const s, LastTag, FirstTag: string; TrimTags: Boolean = True): string;
var
  pLast, pFirst, pNextFirst: Integer;
begin
  pFirst := Pos(FirstTag, s);
  pLast := Pos(LastTag, s);
  while (pLast > 0) and (pFirst > 0) do
  begin
    if (pFirst > pLast) then // Find next LastTag
      pLast := PosEx(LastTag, s, pLast + Length(LastTag))
    else
    begin
      pNextFirst := PosEx(FirstTag, s, pFirst + Length(FirstTag));
      if (pNextFirst = 0) or (pNextFirst > pLast) then
      begin
        if TrimTags then
        begin
          Result := Trim(StringReplace(Trim(Copy(s, pFirst, pLast - pFirst + Length(LastTag))), LastTag, '', [rfReplaceAll, rfIgnoreCase]));
          Result := Trim(StringReplace(Result, FirstTag, '', [rfReplaceAll, rfIgnoreCase]));
        end
        else
          Result := Trim(Copy(s, pFirst, pLast - pFirst + Length(LastTag)));
        Exit;
      end
      else
        pFirst := pNextFirst;
    end;
  end;
  Result := '';
end;
procedure TForm4.btn1Click(Sender: TObject);
var
  Input: TStringStream;
  Output: TBytesStream;
  sTmp, uuidCaptcha, captchaCodigo: string;
  IdHTTP: TIdHTTP;
  Graphic: TGraphic;
begin
  IdHTTP := TIdHTTP.Create(nil);
  try
    IdHTTP.AllowCookies := True;
    IdHTTP.HandleRedirects := True;
    IdHTTP.Request.Connection := 'keep-alive';
    IdHTTP.Request.UserAgent := 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36';
    sTmp := IdHTTP.Get('http://www.tjms.jus.br/cpopg5/imagemCaptcha.do');
    uuidCaptcha := _ExtractBetweenTags(sTmp, '"}', '"uuidCaptcha": "'); // You need this when you send the post request
    captchaCodigo := _ExtractBetweenTags(sTmp, '", "labelValorCaptcha":', 'base64,');
    mmo1.Lines.Add(captchaCodigo);
    Input := TStringStream.Create(captchaCodigo, TEncoding.ASCII);
    try
      Output := TBytesStream.Create;
      try
        Soap.EncdDecd.DecodeStream(Input, Output);
        Output.Position := 0;
        Graphic := TPngImage.Create;
        try
          Graphic.LoadFromStream(Output);
          img1.Picture.Bitmap.Assign(Graphic); // Your image loads here
        finally
          Graphic.Free;
        end;
      finally
        Output.Free;
      end;
    finally
      Input.Free;
    end;
  finally
    IdHTTP.Free;
  end;
end;
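The code above extracts uuidCaptcha but stops before the POST request. Here is a rough sketch of how the form fields could be assembled. Only the uuidCaptcha field name comes from the code above; BuildCaptchaForm, the answer field name and the sample values are hypothetical, so check the site's real form for the names it expects.

```pascal
program CaptchaPostDemo;
{$APPTYPE CONSOLE}
uses
  System.SysUtils, System.Classes;

// Hypothetical helper: builds the name=value pairs for the POST request.
// AnswerField is whatever form field the site expects for the typed captcha text.
// The caller owns (and must free) the returned list.
function BuildCaptchaForm(const UuidCaptcha, AnswerField, TypedText: string): TStringList;
begin
  Result := TStringList.Create;
  Result.Values['uuidCaptcha'] := UuidCaptcha;
  Result.Values[AnswerField] := TypedText;
end;

var
  Params: TStringList;
begin
  // Placeholder values, for illustration only.
  Params := BuildCaptchaForm('example-uuid', 'captchaAnswer', 'abc12');
  try
    Write(Params.Text);
    // With Indy, the list could then be sent with something like:
    //   Response := IdHTTP.Post(SearchUrl, Params); // SearchUrl is a placeholder
  finally
    Params.Free;
  end;
end.
```

It is probably safest to send the POST with the same TIdHTTP instance that fetched the captcha (with AllowCookies enabled), on the assumption that the server ties the captcha to the session cookie.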