Not able to read text from webpage using Selenium WebDriver

I am not able to read the email ID from the webpage below:

URL: https://targetstudy.com/university/2/acharya-ng-ranga-agricultural-university/

Here is my code:

driver.navigate().to(URL);
String email = driver.findElement(By.xpath("//*[@id='site-canvas']/div[6]/div[2]/div[1]/div/div[1]/div/table/tbody/tr/td[2]/table/tbody/tr[4]/td[2]/img")).getText();
System.out.println(email);

Selenium alone can't help you in this case: your XPath ends in an img element, so the email address is rendered as an image and getText() has no text to return. Your binding language (Java) can help, though.

You need a Java Tesseract (OCR) API such as Tess4J.

Code for extracting text:

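import java.io.File;
import net.sourceforge.tess4j.ITesseract;
import net.sourceforge.tess4j.Tesseract;
import net.sourceforge.tess4j.TesseractException;

// Runs OCR on the image file at imageLocation and returns the recognized text.
public String getImgText(String imageLocation) {
    ITesseract instance = new Tesseract();
    try {
        return instance.doOCR(new File(imageLocation));
    } catch (TesseractException e) {
        // Report the failure instead of silently discarding the message.
        System.err.println(e.getMessage());
        return "Error while reading image";
    }
}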
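
getImgText needs the path of an image file on disk, whereas the page only gives you an img element. One possible way to bridge that gap, sketched here with only the JDK, is to copy the image behind its src URL into a temporary file and then search the OCR output for something that looks like an email address. The class name OcrGlue, both method names, the ".png" suffix used in the usage note and the deliberately simple regex are illustrative assumptions, not part of Selenium or Tess4J.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the names here are for illustration only.
class OcrGlue {

    // Simplified email pattern (an assumption); it is not a full RFC 5322 matcher.
    private static final Pattern EMAIL =
            Pattern.compile("[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}");

    // Copies the bytes behind imageUrl into a new temporary file and returns its path.
    // suffix should match the real image format, e.g. ".png" or ".jpg".
    static Path downloadToTempFile(String imageUrl, String suffix) throws IOException {
        Path target = Files.createTempFile("email-image-", suffix);
        try (InputStream in = URI.create(imageUrl).toURL().openStream()) {
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
        return target;
    }

    // Returns the first email-like token in the OCR output, if there is one.
    static Optional<String> extractEmail(String ocrText) {
        Matcher m = EMAIL.matcher(ocrText);
        return m.find() ? Optional.of(m.group()) : Optional.empty();
    }
}
```

With Selenium, the URL would typically come from the src attribute of the img element, for example via WebElement.getAttribute("src"). The path returned by downloadToTempFile(src, ".png") can then be handed to getImgText as path.toString(), and the OCR result to extractEmail. OCR output is not always exact, so the extracted address is worth checking by eye.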

If you are using Maven for your project, just add this dependency:

<dependency> 
 <groupId>net.sourceforge.tess4j</groupId> 
 <artifactId>tess4j</artifactId> 
 <version>3.2.1</version> 
</dependency>   

Further reference: Extracting text from Image
