How to read a PDF file using Selenium

Question

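I am working on a web page that has a link; clicking it opens a PDF file in a new window. I have to read that PDF file to verify some data against the completed transaction. One approach is to download the file and then work with it. Can anyone help me with this? I have to work on IE 11.

Thanks in advance.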

Answer 1

Use PDFBox and FontBox.

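    import java.io.BufferedInputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import java.net.URL;
    import org.apache.pdfbox.pdmodel.PDDocument;
    import org.apache.pdfbox.text.PDFTextStripper;
    import org.openqa.selenium.WebDriver;
    import org.openqa.selenium.firefox.FirefoxDriver;

    public String readPDFInURL() throws IOException {
        WebDriver driver = new FirefoxDriver();
        // page with example pdf document
        driver.get("file:///C:/Users/admin/Downloads/dotnet_TheRaceforEmpires.pdf");
        URL url = new URL(driver.getCurrentUrl());
        InputStream is = url.openStream();
        BufferedInputStream fileToParse = new BufferedInputStream(is);
        PDDocument document = null;
        // declared outside the try block so it is still in scope at the return
        String output;
        try {
            document = PDDocument.load(fileToParse);
            // extract the text of every page
            output = new PDFTextStripper().getText(document);
        } finally {
            if (document != null) {
                document.close();
            }
            fileToParse.close();
            is.close();
        }
        return output;
    }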

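Because some functions from earlier versions of PDFBox have been deprecated, FontBox needs to be used along with PDFBox. I used PDFBox (2.0.3) and FontBox (2.0.3) and it works fine. It will not read images, though.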
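The question also mentions downloading the file first and then working with it. A minimal sketch of that route, using only the Java standard library, might look like the following. The class and method names (`PdfDownload`, `downloadToTempFile`, `looksLikePdf`) are hypothetical, and the sketch assumes the PDF URL can be fetched without the browser's session; if the site requires a login, the request may return an HTML page instead, which is why the header check is included.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;

class PdfDownload {

    // Hypothetical helper: copies the bytes behind pdfUrl into a new
    // temporary file and returns the path of that file.
    static Path downloadToTempFile(URL pdfUrl) throws IOException {
        Path target = Files.createTempFile("downloaded-", ".pdf");
        try (InputStream in = pdfUrl.openStream()) {
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
        return target;
    }

    // Cheap sanity check: a PDF file normally begins with the bytes "%PDF-".
    // Assumption: anything else (for example an HTML login page) is not
    // the document we wanted.
    static boolean looksLikePdf(Path file) throws IOException {
        byte[] magic = "%PDF-".getBytes(StandardCharsets.US_ASCII);
        byte[] head = new byte[magic.length];
        int read = 0;
        try (InputStream in = Files.newInputStream(file)) {
            // read() may return fewer bytes than requested, so loop
            while (read < head.length) {
                int n = in.read(head, read, head.length - read);
                if (n < 0) {
                    break; // file is shorter than the header
                }
                read += n;
            }
        }
        return read == head.length && Arrays.equals(head, magic);
    }
}
```

If the check passes, the saved file can then be handed to PDFBox, for example with `PDDocument.load(path.toFile())` in PDFBox 2.x.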

Answer 2

First, download the pdfbox jar.

strURL is a web URL that points to a .pdf file, like ( https://example.com/downloads/presence/Online-Presence-CA-05-02-2017-04-13.pdf )

    import java.io.BufferedInputStream;
    import java.net.URL;
    import org.apache.pdfbox.pdmodel.PDDocument;
    import org.apache.pdfbox.text.PDFTextStripper;

    public boolean verifyPDFContent(String strURL, String text) {
        String output = "";
        try {
            URL url = new URL(strURL);
            BufferedInputStream file = new BufferedInputStream(url.openStream());
            PDDocument document = null;
            try {
                document = PDDocument.load(file);
                // extract the text of every page
                output = new PDFTextStripper().getText(document);
                System.out.println(output);
            } finally {
                if (document != null) {
                    document.close();
                }
                file.close();
            }
        } catch (Exception e) {
            // on any failure output stays empty, so the check below fails
            e.printStackTrace();
        }
        return output.contains(text);
    }
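One thing to watch with a plain `contains` check: text extracted from a PDF often has line breaks and runs of spaces where the page layout wraps, so an expected phrase may not match character for character. A small sketch that collapses whitespace on both sides before comparing is shown below; the class name `PdfTextCheck`, its methods, and the sample transaction text are made up for illustration.

```java
class PdfTextCheck {

    // Collapse every run of whitespace (spaces, tabs, line breaks)
    // into a single space and trim the ends.
    static String normalize(String s) {
        return s.replaceAll("\\s+", " ").trim();
    }

    // True if the expected phrase occurs in the extracted text once
    // whitespace differences are ignored.
    static boolean containsNormalized(String extracted, String expected) {
        return normalize(extracted).contains(normalize(expected));
    }

    public static void main(String[] args) {
        // Made-up sample resembling text extracted from a PDF
        String extracted = "Transaction ID:\n 12345\r\nAmount:   99.50 USD";
        System.out.println(containsNormalized(extracted, "Transaction ID: 12345"));
        System.out.println(extracted.contains("Transaction ID: 12345"));
    }
}
```

In `verifyPDFContent`, the final check could then become `PdfTextCheck.containsNormalized(output, text)`, assuming whitespace differences do not matter for the data being verified.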
