
How to read the text from image (captcha) by using Selenium WebDriver with Java

I have a registration web page, and a CAPTCHA is displayed at the end of it.

I am not able to read the text from the image. My code and its output are below.

@Test
public void loginTest() throws InterruptedException {
    System.out.println("Testing");
    driver.get("https://customer.onlinelic.in/ForgotPwd.htm");

    WebElement element = driver.findElement(By.xpath("//*[@id='forgotPassword']/table/tbody/tr[5]/td[3]/img"));
    System.out.println(" get the instance ");

    String elementTest = element.getAttribute("src");
    System.out.println("Element : " + elementTest);
}

Output: Error

Exception in thread "main" org.openqa.selenium.NoSuchElementException: Unable to locate element: {"method":"xpath","selector":"//*[@id='forgotPassword']/table/tbody/tr[5]/td[3]/img"}
Command duration or timeout: 60.02 seconds
For documentation on this error, please visit: http://seleniumhq.org/exceptions/no_such_element.html
Build info: version: '2.35.0', revision: '8df0c6b', time: '2013-08-12 15:43:19'
System info: os.name: 'Windows 7', os.arch: 'amd64', os.version: '6.1', java.version: '1.6.0_26'
Session ID: 5f5b2e1a-56a4-49ad-8fd3-2870747a7768
Driver info: org.openqa.selenium.firefox.FirefoxDriver
Capabilities [{platform=XP, acceptSslCerts=true, javascriptEnabled=true, browserName=firefox, rotatable=false, locationContextEnabled=true, version=23.0.1, cssSelectorsEnabled=true, databaseEnabled=true, handlesAlerts=true, browserConnectionEnabled=true, nativeEvents=true, webStorageEnabled=true, applicationCacheEnabled=true, takesScreenshot=true}]
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    at org.openqa.selenium.remote.ErrorHandler.createThrowable(ErrorHandler.java:191)
    at org.openqa.selenium.remote.ErrorHandler.throwIfResponseFailed(ErrorHandler.java:145)
    at org.openqa.selenium.remote.RemoteWebDriver.execute(RemoteWebDriver.java:554)
    at org.openqa.selenium.remote.RemoteWebDriver.findElement(RemoteWebDriver.java:307)
    at org.openqa.selenium.remote.RemoteWebDriver.findElementByXPath(RemoteWebDriver.java:404)
    at org.openqa.selenium.By$ByXPath.findElement(By.java:344)
    at org.openqa.selenium.remote.RemoteWebDriver.findElement(RemoteWebDriver.java:299)
    at seleniumtest.CaptchaTest.loginTest(CaptchaTest.java:41)
    at seleniumtest.CaptchaTest.main(CaptchaTest.java:59)
Caused by: org.openqa.selenium.remote.ErrorHandler$UnknownServerException: Unable to locate element: {"method":"xpath","selector":"//*[@id='forgotPassword']/table/tbody/tr[5]/td[3]/img"}
Build info: version: '2.35.0', revision: '8df0c6b', time: '2013-08-12 15:43:19'
System info: os.name: 'Windows 7', os.arch: 'amd64', os.version: '6.1', java.version: '1.6.0_26'
Driver info: driver.version: unknown
    at .FirefoxDriver.prototype.findElementInternal_(file:///C:/Users/lukup/AppData/Local/Temp/anonymous4043037924964932185webdriver-profile/extensions/fxdriver@googlecode.com/components/driver_component.js:8880)
    at .fxdriver.Timer.prototype.setTimeout/<.notify(file:///C:/Users/lukup/AppData/Local/Temp/anonymous4043037924964932185webdriver-profile/extensions/fxdriver@googlecode.com/components/driver_component.js:396)

Just to elaborate on the previous answers: CAPTCHA is an acronym for "Completely Automated Public Turing test to tell Computers and Humans Apart". So if a machine can solve it, it is not really doing its job.

If you need to solve it anyway, you can use the API of an external service such as http://www.deathbycaptcha.com. You implement their API, pass them the CAPTCHA, and get the text in return. The average solving time I have observed is around 10-15 seconds.

Example implementation (taken from here):

import com.DeathByCaptcha.AccessDeniedException;
import com.DeathByCaptcha.Captcha;
import com.DeathByCaptcha.Client;
import com.DeathByCaptcha.SocketClient;
import com.DeathByCaptcha.HttpClient;

/* Put your DeathByCaptcha account username and password here.
   Use HttpClient for HTTP API. */
Client client = (Client)new SocketClient(username, password);
try {
    double balance = client.getBalance();

    /* Put your CAPTCHA file name, or file object, or arbitrary input stream,
       or an array of bytes, and optional solving timeout (in seconds) here: */
    Captcha captcha = client.decode(captchaFileName, timeout);
    if (null != captcha) {
        /* The CAPTCHA was solved; captcha.id property holds its numeric ID,
           and captcha.text holds its text. */
        System.out.println("CAPTCHA " + captcha.id + " solved: " + captcha.text);

        if (/* check if the CAPTCHA was incorrectly solved */) {
            client.report(captcha);
        }
    }
} catch (AccessDeniedException e) {
    /* Access to DBC API denied, check your credentials and/or balance */
}

Two problems.

  1. You have the wrong XPath, so you are getting a NoSuchElementException.

  2. Even if you had the right XPath, you would not be able to extract the text, as that would defeat the point of CAPTCHA.

The whole purpose of CAPTCHA is to prevent automation through the UI! You may want to use internal APIs to verify the action instead.

I have a solution that works for a specific website. Take a screenshot of the whole page and crop out the captcha image. Then divide the width of the captcha image by the number of characters (for a given captcha this is usually constant) to get the individual character images. Collect all the possible characters by reloading the page repeatedly.

Once you have all the possible characters, you can take any captcha image, compare each of its characters with the stored images, and decide which letter or digit it is.

Steps to follow:

  1. Collect the captcha image and divide it into individual characters.

     private static BufferedImage cropImage(File filePath, int x, int y, int w, int h) {
         try {
             BufferedImage originalImage = ImageIO.read(filePath);
             return originalImage.getSubimage(x, y, w, h);
         } catch (IOException e) {
             e.printStackTrace();
             return null;
         }
     }

  2. Keep all possible character images in a folder. Each image's file name specifies the digit or character it shows.

  3. Now read each character image of the captcha and compare it with all the images in that folder. You can compare two images using their pixel values:

     public static float getDiff(File f1, File f2, int width, int height) throws IOException {
         BufferedImage bi1 = ImageIO.read(f1);
         BufferedImage bi2 = ImageIO.read(f2);
         float diff = 0;
         for (int i = 0; i < width; i++) {
             for (int j = 0; j < height; j++) {
                 int rgb1 = bi1.getRGB(i, j);
                 int rgb2 = bi2.getRGB(i, j);
                 // Extract the blue, green and red channels of each pixel.
                 int b1 = rgb1 & 0xff;
                 int g1 = (rgb1 & 0xff00) >> 8;
                 int r1 = (rgb1 & 0xff0000) >> 16;
                 int b2 = rgb2 & 0xff;
                 int g2 = (rgb2 & 0xff00) >> 8;
                 int r2 = (rgb2 & 0xff0000) >> 16;
                 // Accumulate the absolute difference per channel.
                 diff += Math.abs(b1 - b2);
                 diff += Math.abs(g1 - g2);
                 diff += Math.abs(r1 - r2);
             }
         }
         return diff;
     }

  4. Whichever image has the smallest diff value is the match. Append its name to a string.

  5. After reading all the character images of the captcha, return the string.

This works only for simple captchas like this one: https://i.stack.imgur.com/FYPhd.png
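The steps above can be sketched end to end in plain Java. This is only a minimal illustration, not the answer author's code: the class name CaptchaTemplateMatcher and its methods are hypothetical, it works on in-memory BufferedImage objects rather than files, and it assumes that every character occupies an equal-width slice and that each reference image is at least roughly the size of one slice.

```java
import java.awt.image.BufferedImage;
import java.util.Map;

// Hypothetical helper: slices a captcha and matches each slice to the closest reference image.
class CaptchaTemplateMatcher {

    /** Splits the captcha into `count` equal-width slices, left to right. */
    static BufferedImage[] splitIntoCharacters(BufferedImage captcha, int count) {
        int sliceWidth = captcha.getWidth() / count;
        BufferedImage[] slices = new BufferedImage[count];
        for (int i = 0; i < count; i++) {
            slices[i] = captcha.getSubimage(i * sliceWidth, 0, sliceWidth, captcha.getHeight());
        }
        return slices;
    }

    /** Sum of absolute per-channel RGB differences over the area both images share. */
    static long diff(BufferedImage a, BufferedImage b) {
        int width = Math.min(a.getWidth(), b.getWidth());
        int height = Math.min(a.getHeight(), b.getHeight());
        long total = 0;
        for (int x = 0; x < width; x++) {
            for (int y = 0; y < height; y++) {
                int p = a.getRGB(x, y);
                int q = b.getRGB(x, y);
                total += Math.abs(((p >> 16) & 0xff) - ((q >> 16) & 0xff)); // red
                total += Math.abs(((p >> 8) & 0xff) - ((q >> 8) & 0xff));   // green
                total += Math.abs((p & 0xff) - (q & 0xff));                 // blue
            }
        }
        return total;
    }

    /** Reads the captcha by picking, for each slice, the template name with the smallest diff. */
    static String read(BufferedImage captcha, int count, Map<String, BufferedImage> templates) {
        if (templates.isEmpty()) {
            throw new IllegalArgumentException("at least one reference image is required");
        }
        StringBuilder text = new StringBuilder();
        for (BufferedImage slice : splitIntoCharacters(captcha, count)) {
            String best = null;
            long bestDiff = Long.MAX_VALUE;
            for (Map.Entry<String, BufferedImage> entry : templates.entrySet()) {
                long d = diff(slice, entry.getValue());
                if (d < bestDiff) {
                    bestDiff = d;
                    best = entry.getKey();
                }
            }
            text.append(best);
        }
        return text.toString();
    }
}
```

In practice the templates map would be filled by loading each file in the reference folder with ImageIO.read and keying it by the file name without its extension. Any noise, rotation, or variable character spacing in the captcha will break the equal-width assumption, which is why this only suits very simple captchas.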

And here is sample code that reads the text from an image (its URL is printed in the output below):

import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.net.URL;
import javax.imageio.ImageIO;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.firefox.FirefoxDriver;
import org.testng.annotations.BeforeTest;
import org.testng.annotations.Test;
import com.asprise.util.ocr.OCR;

public class ExtractImage {

    WebDriver driver;

    @BeforeTest
    public void setUpDriver() {
        driver = new FirefoxDriver();
    }

    @Test
    public void start() throws IOException {

        /*
         * Navigate to http://www.mythoughts.co.in/2013/10/extract-and-verify-text-from-image.html
         * and get the image's src attribute.
         */
        driver.get("http://www.mythoughts.co.in/2013/10/extract-and-verify-text-from-image.html");
        String imageUrl = driver.findElement(By.xpath("//*[@id='post-body-5614451749129773593']/div[1]/div[1]/div/a/img")).getAttribute("src");
        System.out.println("Image source path : \n" + imageUrl);

        // Download the image and run OCR on it.
        // BufferedImage implements RenderedImage, so no cast is needed.
        URL url = new URL(imageUrl);
        BufferedImage image = ImageIO.read(url);
        String s = new OCR().recognizeCharacters(image);
        System.out.println("Text From Image : \n" + s);
        System.out.println("Length of total text : \n" + s.length());
        driver.quit();

        /*
         * Use the code below if you want to read the image from your hard disk:
         *
         * BufferedImage image = ImageIO.read(new File("Image location"));
         * String imageText = new OCR().recognizeCharacters(image);
         * System.out.println("Text From Image : \n" + imageText);
         * System.out.println("Length of total text : \n" + imageText.length());
         */
    }

}

Here is the output of the above program:

Image source path : http://2.bp.blogspot.com/-42SgMHAeF8U/Uk8QlYCoy-I/AAAAAAAADSA/TTAVAAgDhio/s1600/love.jpg

Never M2suse the O, ne Who Likes You Never Say Busy To Th,e One Who Needs You Never cheat The One Who ReaZZy Trust You, Never foJnget The One Who Zways Remember You.

Length of total text : 175

The forgot-password form is in an iframe. That is why Selenium cannot find the element. You need to switch to the iframe holding the form first, and then run your findElement call. Your XPath is correct.

Use driver.switchTo().frame(arg0) to switch into the frame. See the javadoc here

As for getting the captcha text, I didn't understand what you meant by 'store the test and compare'. Ideally you shouldn't be able to read the text from the captcha (as others have mentioned). One alternative approach I have seen is to store the captcha value as alt text in the development and QA environments, so that your test can read it and enter it in the text box. When the code goes to production or any outside environment, this alt text is removed.

One cannot read from a CAPTCHA. If you could read from a CAPTCHA, there would be no point in using one.
