简体   繁体   中英

Not able to read text from webpage using selenium webdriver

i am not able to read the email id from the webpage below :

URL : https://targetstudy.com/university/2/acharya-ng-ranga-agricultural-university/

Here is my code

driver.navigate().to(URL);
String Email = driver.findElement(By.xpath("//*[@id="site-canvas"]/div[6]/div[2]/div[1]/div/div[1]/div/table/tbody/tr/td[2]/table/tbody/tr[4]/td[2]/img")).getText();
System.out.println(Email);

Selenium alone can't help you in this case, though your binding language would help you.

You need Java Tesseract API.

Code for extracting text :

 public String getImgText(String imageLocation) {
      ITesseract instance = new Tesseract();
      try 
      {
         String imgText = instance.doOCR(new File(imageLocation));
         return imgText;
      } 
      catch (TesseractException e) 
      {
         e.getMessage();
         return "Error while reading image";
      }
   }

If you are using maven for your Project, just add this dependency :

<dependency> 
 <groupId>net.sourceforge.tess4j</groupId> 
 <artifactId>tess4j</artifactId> 
 <version>3.2.1</version> 
</dependency>   

More Reference : Extracting text from Image

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM