Unable to load PDF from URL
I am using Selenium and I want to read the content of a PDF that is opened at a URL.
String url = "https://dms.careerbuilder.com/viewer?Token=e6c1c73dfd2e4e42b806f414f41ae6cd&key=574dda953a7bd92e0ab217d1a637d88b41926aab6033dee85660d385b335ac86";
try {
    String pdfContent = readPdfContent(url);
    Assert.assertTrue(pdfContent.contains("Test Kumar"));
    Assert.assertTrue(pdfContent.contains("XXXXX"));
} catch (MalformedURLException e) {
    e.printStackTrace();
} catch (IOException e) {
    e.printStackTrace();
}
Below is the function that is called above. It fails to compile with the error: 'The method load(BufferedInputStream) is undefined for the type PDDocument'
public static String readPdfContent(String url) throws IOException {
    URL pdfUrl = new URL(url);
    // try-with-resources closes the streams and the document even if parsing fails
    try (InputStream in = pdfUrl.openStream();
         BufferedInputStream bf = new BufferedInputStream(in);
         PDDocument doc = PDDocument.load(bf)) {
        int numberOfPages = getPageCount(doc);
        System.out.println("The total number of pages " + numberOfPages);
        return new PDFTextStripper().getText(doc);
    }
}
public static int getPageCount(PDDocument doc) {
    // get the total number of pages in the PDF document
    int pageCount = doc.getNumberOfPages();
    return pageCount;
}
}
Please help me with this. Thanks in advance.
Maybe you imported the wrong library. If you have a Maven project, add this dependency:
<dependency>
<groupId>org.apache.pdfbox</groupId>
<artifactId>pdfbox</artifactId>
<version>2.0.24</version>
</dependency>
and import these classes:
import java.net.URL;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.text.PDFTextStripper;
import java.io.InputStream;
import java.io.BufferedInputStream;
import java.io.IOException;
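Once the imports compile, it may also be worth checking that the URL really returns PDF bytes: a "viewer" link like the one in the question could just as well return an HTML page, which PDFBox cannot parse. Below is a minimal sketch of such a check using only the JDK (Java 11+ for readNBytes). The class name PdfHeaderCheck and the method looksLikePdf are hypothetical names made up for this example and are not part of PDFBox or Selenium.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of PDFBox: peeks at the first bytes of a stream.
public class PdfHeaderCheck {
    // A PDF file normally begins with the ASCII marker "%PDF-".
    private static final byte[] PDF_MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    /**
     * Returns true if the stream starts with "%PDF-".
     * The stream is reset afterwards, so it can still be passed to a PDF parser.
     */
    public static boolean looksLikePdf(BufferedInputStream in) throws IOException {
        in.mark(PDF_MAGIC.length);
        byte[] head = in.readNBytes(PDF_MAGIC.length);
        in.reset();
        return Arrays.equals(head, PDF_MAGIC);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.out.println("usage: java PdfHeaderCheck <url>");
            return;
        }
        try (InputStream raw = new URL(args[0]).openStream();
             BufferedInputStream in = new BufferedInputStream(raw)) {
            System.out.println(looksLikePdf(in)
                    ? "Response starts with a PDF header"
                    : "Response is not a PDF (possibly an HTML viewer page)");
        }
    }
}
```

If the check reports that the response is not a PDF, the text extraction in readPdfContent would fail at runtime even after the compile error is fixed, so the direct download link of the document would be needed instead of the viewer page.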