How to extract text to a JLabel using PDFBox

Question

I haven't been coding for long and decided to write a program that downloads the current Official Golf World Rankings in PDF form and then displays the top 10 using JLabels.

While the program is able to download the file, I have been unable to find out how to extract individual cells from the table containing the data, i.e. extract the "This Week", "Name" and "Country" columns into individual arrays.

Could someone please give me some advice on how I would go about doing this?

Answer 1

I recently had to do something similar. My code looks like this (using PDFBox):

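import java.io.FileInputStream;
import org.apache.pdfbox.pdfparser.PDFParser;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.util.PDFTextStripper;

// Parse the downloaded PDF into a document object (PDFBox 1.x API).
PDFParser pdfParser = new PDFParser(new FileInputStream("c:\\temp\\owgr49f2013.pdf"));
pdfParser.parse();
PDDocument pdDocument = pdfParser.getPDDocument();

// Extract the text, writing "###" between words so they are easy to split later.
PDFTextStripper stripper = new PDFTextStripper("UTF-8");
stripper.setSortByPosition(false);
stripper.setWordSeparator("###");
System.out.println(stripper.getText(pdDocument));
pdDocument.close();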

You'll then need to pull the fields you want out of the resulting text yourself, for example with regular expressions that match one table row per line.
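As a rough illustration of that last step, here is a minimal sketch of how the stripped text might be turned into rows and then into JLabels. The row layout is an assumption, not something confirmed for the real rankings PDF: each data line is taken to start with the "This Week" rank, followed by the name words and a three-letter upper-case country code, all separated by "###". The class name RankingParser and its two methods are hypothetical names chosen for this example, so the pattern will most likely need adjusting once you have looked at the actual output.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.swing.JLabel;

class RankingParser {
    // ASSUMED row layout: rank###name words###COUNTRY###anything else.
    // Check the real stripper output and adapt this pattern to it.
    private static final Pattern ROW =
            Pattern.compile("^(\\d+)###(.+?)###([A-Z]{3})(?:###.*)?$");

    /** Returns one String[]{rank, name, country} per line that matches ROW. */
    static List<String[]> parse(String text) {
        List<String[]> rows = new ArrayList<>();
        for (String line : text.split("\\r?\\n")) {
            Matcher m = ROW.matcher(line.trim());
            if (m.matches()) {
                // Name words were separated by "###" too, so join them with spaces.
                String name = m.group(2).replace("###", " ");
                rows.add(new String[] { m.group(1), name, m.group(3) });
            }
        }
        return rows;
    }

    /** Builds one JLabel per row, for at most the first 'limit' rows. */
    static List<JLabel> toLabels(List<String[]> rows, int limit) {
        List<JLabel> labels = new ArrayList<>();
        for (int i = 0; i < rows.size() && i < limit; i++) {
            String[] r = rows.get(i);
            labels.add(new JLabel(r[0] + ". " + r[1] + " (" + r[2] + ")"));
        }
        return labels;
    }

    public static void main(String[] args) {
        // Made-up sample text in the assumed layout, not real ranking data.
        String sample = "This###Week###Name###Country\n"
                + "1###Tiger###Woods###USA###11.69\n"
                + "2###Adam###Scott###AUS###9.52";
        for (JLabel label : toLabels(parse(sample), 10)) {
            System.out.println(label.getText());
        }
    }
}
```

In the real program you would pass the string returned by stripper.getText(pdDocument) to parse, then add the labels from toLabels(rows, 10) to a panel. Lines that do not match the pattern, such as the table header, are simply skipped.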

Answer 1, posted 2013-12-20 08:55:09
