
Java - Retrieving data from a website

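I am making an application that retrieves lottery numbers and displays them in a window. However, I am not sure how to retrieve the data and numbers from the website: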

https://www.national-lottery.co.uk/player/p/results.ftl

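How would you go about doing this? I have done something similar before, but with a website that returned a data string I could use directly. I am less sure how to do it in this case. Any suggestions would be greatly appreciated, and the technique (if there is one) will help me in further projects!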

Use Jsoup to retrieve and parse the page:

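import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

String url = "https://www.national-lottery.co.uk/player/p/results.ftl";
Document document = Jsoup.connect(url).get();  // fetches and parses the page; throws IOException
final Elements elementsByTag = document.getElementsByTag("table");
// ... then work with the table or any other element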

The site provides a link to download the numbers as a CSV file. Just use that:

https://www.national-lottery.co.uk/player/lotto/results/downloadResultsCSV.ftl

It looks like this:

DrawDate,Ball 1,Ball 2,Ball 3,Ball 4,Ball 5,Ball 6,Bonus Ball,Ball Set,Machine
07-Apr-2012,23,12,42,16,25,31,18,6,LANCELOT
04-Apr-2012,44,23,9,40,33,26,31,2,MERLIN
31-Mar-2012,2,49,40,47,18,5,19,1,MERLIN
28-Mar-2012,16,8,39,22,3,38,26,3,MERLIN
24-Mar-2012,24,27,6,39,31,45,32,4,LANCELOT
21-Mar-2012,10,14,45,25,39,21,40,1,MERLIN
17-Mar-2012,37,40,1,3,20,16,15,2,MERLIN
14-Mar-2012,15,36,26,31,14,18,48,4,MERLIN
10-Mar-2012,12,37,23,43,3,1,33,1,MERLIN
07-Mar-2012,28,44,8,35,11,2,17,3,MERLIN
03-Mar-2012,31,20,40,28,7,23,42,4,MERLIN
29-Feb-2012,41,29,46,14,49,13,43,3,LANCELOT
25-Feb-2012,29,27,26,7,32,25,33,1,LANCELOT
22-Feb-2012,35,12,7,49,43,15,8,4,MERLIN
18-Feb-2012,19,22,30,33,41,2,24,4,LANCELOT
15-Feb-2012,30,40,28,33,9,44,16,3,MERLIN
11-Feb-2012,24,31,23,1,49,45,6,3,LANCELOT
08-Feb-2012,7,13,31,44,36,16,26,8,LANCELOT
04-Feb-2012,41,45,7,40,48,4,46,2,MERLIN
01-Feb-2012,7,39,38,17,22,21,3,2,LANCELOT
28-Jan-2012,10,25,31,40,28,12,1,2,LANCELOT
25-Jan-2012,2,30,8,26,45,39,46,1,MERLIN
21-Jan-2012,17,5,32,39,49,42,19,5,MERLIN
18-Jan-2012,22,43,34,9,31,35,20,6,MERLIN
14-Jan-2012,7,12,10,15,25,42,33,7,LANCELOT
11-Jan-2012,40,33,39,9,2,27,45,6,LANCELOT
07-Jan-2012,47,8,15,17,14,20,38,7,MERLIN
04-Jan-2012,42,43,30,9,28,26,2,8,MERLIN
31-Dec-2011,11,38,42,37,44,7,2,7,LANCELOT
28-Dec-2011,48,11,49,13,17,8,19,6,LANCELOT
24-Dec-2011,43,32,36,15,23,1,19,7,LANCELOT
21-Dec-2011,30,7,28,34,38,45,6,5,MERLIN
17-Dec-2011,42,1,35,48,39,22,12,5,MERLIN
14-Dec-2011,3,43,30,28,10,25,31,8,MERLIN
10-Dec-2011,30,21,29,39,24,16,20,6,LANCELOT
07-Dec-2011,10,31,27,47,32,14,41,5,MERLIN
03-Dec-2011,49,1,35,48,47,30,8,8,MERLIN
30-Nov-2011,30,26,25,24,23,13,4,7,MERLIN
26-Nov-2011,13,36,26,16,25,46,15,6,MERLIN
23-Nov-2011,19,31,48,22,4,11,6,5,MERLIN
19-Nov-2011,32,31,1,34,29,36,45,3,ARTHUR
16-Nov-2011,26,40,39,27,10,12,20,1,GUINEVERE
12-Nov-2011,28,13,12,33,6,38,10,14,ARTHUR
09-Nov-2011,27,2,8,32,23,10,44,1,GUINEVERE
05-Nov-2011,14,24,39,23,16,27,43,8,LANCELOT
02-Nov-2011,12,38,11,33,37,49,3,2,GUINEVERE
29-Oct-2011,49,14,5,28,9,46,45,1,GUINEVERE
26-Oct-2011,4,23,34,41,38,39,27,4,GUINEVERE
22-Oct-2011,20,43,27,44,28,34,1,4,ARTHUR
19-Oct-2011,13,18,34,49,32,14,20,3,GUINEVERE
15-Oct-2011,41,7,12,46,34,27,14,2,ARTHUR
12-Oct-2011,37,26,40,25,13,24,30,3,ARTHUR
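
Rows in this format can be split on commas and converted to typed values. The following is a minimal sketch, assuming every data row has exactly the ten columns shown above and that dates always use English month abbreviations. The class and field names (`LottoCsv`, `Draw`) are made up for illustration and are not part of any library.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper; names are illustrative only.
final class LottoCsv {

    // One parsed data row of the CSV.
    static final class Draw {
        final LocalDate date;
        final int[] balls;    // the six main numbers, in file order
        final int bonusBall;
        final int ballSet;
        final String machine;

        Draw(LocalDate date, int[] balls, int bonusBall, int ballSet, String machine) {
            this.date = date;
            this.balls = balls;
            this.bonusBall = bonusBall;
            this.ballSet = ballSet;
            this.machine = machine;
        }
    }

    // Matches dates such as "07-Apr-2012".
    private static final DateTimeFormatter DATE_FORMAT =
            DateTimeFormatter.ofPattern("dd-MMM-uuuu", Locale.ENGLISH);

    // Parses CSV lines, skipping blank lines and the "DrawDate,..." header row.
    static List<Draw> parse(List<String> lines) {
        List<Draw> draws = new ArrayList<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("DrawDate")) {
                continue;
            }
            String[] f = line.split(",");
            if (f.length != 10) {
                throw new IllegalArgumentException("Unexpected row: " + line);
            }
            int[] balls = new int[6];
            for (int i = 0; i < 6; i++) {
                balls[i] = Integer.parseInt(f[i + 1].trim());
            }
            draws.add(new Draw(
                    LocalDate.parse(f[0].trim(), DATE_FORMAT),
                    balls,
                    Integer.parseInt(f[7].trim()),
                    Integer.parseInt(f[8].trim()),
                    f[9].trim()));
        }
        return draws;
    }
}
```

If the file has been saved locally, the lines could be obtained with `java.nio.file.Files.readAllLines` and passed straight to `LottoCsv.parse`.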

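Create a URL object for the page address. Open a connection to the URL. Get the input stream. Read all the data from the stream. This will be the page source.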

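import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

URL url = new URL("https://www.national-lottery.co.uk/player/p/results.ftl");
URLConnection connection = url.openConnection();

String source;
// available() only reports how many bytes can be read without blocking, not the
// size of the whole page, so read until end of stream instead (Java 9+).
try (InputStream stream = connection.getInputStream()) {
    // UTF-8 is an assumption; use the charset the server actually declares.
    source = new String(stream.readAllBytes(), StandardCharsets.UTF_8);
}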
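
Note that a single `read` call is not guaranteed to return the whole response, so the stream has to be read in a loop until it reports end of stream. A minimal sketch of such a loop, which also works on Java versions without `InputStream.readAllBytes`, might look like this. The class name `PageSource` and the 8 KB buffer size are arbitrary choices for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;

// Hypothetical helper; the name is illustrative only.
final class PageSource {

    // Reads the stream until read() returns -1 and decodes the bytes
    // with the given charset. The caller is responsible for closing the stream.
    static String readAll(InputStream in, Charset charset) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return new String(buffer.toByteArray(), charset);
    }
}
```

It would be called with the stream from `connection.getInputStream()` and whichever charset the page is served in.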

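Unless the site offers an API or web service for querying the lottery numbers, you will probably have to scrape the page's HTML source. It looks like the numbers are stored in a simple HTML list: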

<ul>
  <li>12</li>
  <li>16</li>
  <li>23</li>
  <li>25</li>
  <li>31</li>
  <li>42</li>
  <li class="bonus">18</li>
</ul>
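
For a fragment this simple, the numbers can be pulled out with the standard library alone. The sketch below assumes the markup is exactly as shown (plain `<li>` items, with the bonus ball marked by `class="bonus"`); the class and method names are hypothetical. A regular expression is brittle against markup changes, so a real HTML parser is the safer choice for the full page.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; names are illustrative only.
final class LottoListScraper {

    // Group 1: value of the optional class attribute; group 2: the number.
    private static final Pattern ITEM = Pattern.compile(
            "<li(?:\\s+class=\"([^\"]*)\")?\\s*>\\s*(\\d+)\\s*</li>");

    // Returns every number whose <li> is not marked class="bonus".
    static List<Integer> mainNumbers(String html) {
        List<Integer> result = new ArrayList<>();
        Matcher m = ITEM.matcher(html);
        while (m.find()) {
            if (!"bonus".equals(m.group(1))) {
                result.add(Integer.parseInt(m.group(2)));
            }
        }
        return result;
    }

    // Returns the first number marked class="bonus", if there is one.
    static OptionalInt bonusNumber(String html) {
        Matcher m = ITEM.matcher(html);
        while (m.find()) {
            if ("bonus".equals(m.group(1))) {
                return OptionalInt.of(Integer.parseInt(m.group(2)));
            }
        }
        return OptionalInt.empty();
    }
}
```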

There are plenty of good Java HTML parsers out there. Here are a few projects:

I looked around the site you are interested in, and it seems to have a "history" page that lists the lottery numbers for several days:

https://www.national-lottery.co.uk/player/lotto/results/results.ftl

That might be a better page to use.
