How do I read web page content with a Java program?

Question

I am planning to write a Java program that reads exchange rates from a website (http://www.doviz.com). What is the best approach to read only the content I need, or to read the whole page and strip out everything else?

Any help is appreciated.

Answer 1

My advice is to use the Jsoup library.

It makes it very easy to parse external content with a CSS/jQuery-like syntax:

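import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

// Fetch and parse a remote page in one line (get() throws IOException)
Document doc = Jsoup.connect("http://jsoup.org").get();

// DOM-style ("JavaScript-like") traversal
Element content = doc.getElementById("content");
Elements links = content.getElementsByTag("a");
for (Element link : links) {
  String linkHref = link.attr("href"); // value of the href attribute
  String linkText = link.text();       // visible text of the link
}

// jQuery/CSS-style selectors
Elements resultLinks = doc.select("h3.r > a");  // <a> elements directly inside <h3 class="r">
Elements pngs = doc.select("img[src$=.png]");   // images whose src ends with .png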

Just add the jsoup.jar library to your classpath and you are ready to go.
It is open source and free to use, of course.

Answer 2

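I suggest you implement (programmatically) an RSS reading mechanism for the web page and extract the contents of the RSS XML with a standard parser.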
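
As a rough illustration of this idea, here is a minimal sketch that parses RSS XML with the JDK's built-in DOM parser. It assumes the site publishes an RSS feed at all, which the question does not confirm, and that each rate appears as an <item> with <title> and <description> children. The class name RssItemReader, the sample feed, and the values in it are hypothetical and only for illustration.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class RssItemReader {

    // Parses RSS XML text and maps each <item>'s title to its description.
    static Map<String, String> readItems(String rssXml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Refuse DOCTYPE declarations, since the feed is untrusted input.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(rssXml)));

        Map<String, String> items = new LinkedHashMap<>();
        NodeList nodes = doc.getElementsByTagName("item");
        for (int i = 0; i < nodes.getLength(); i++) {
            Element item = (Element) nodes.item(i);
            items.put(childText(item, "title"), childText(item, "description"));
        }
        return items;
    }

    // Returns the trimmed text of the first descendant with the given tag, or "" if absent.
    private static String childText(Element parent, String tag) {
        NodeList list = parent.getElementsByTagName(tag);
        return list.getLength() == 0 ? "" : list.item(0).getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical feed content; a real feed's layout and values will differ.
        String sample =
            "<rss version=\"2.0\"><channel><title>Rates</title>"
            + "<item><title>USD</title><description>1.75</description></item>"
            + "<item><title>EUR</title><description>2.52</description></item>"
            + "</channel></rss>";
        for (Map.Entry<String, String> entry : readItems(sample).entrySet()) {
            System.out.println(entry.getKey() + " = " + entry.getValue());
        }
    }
}
```

To read a live feed instead of a string, DocumentBuilder.parse also accepts a URI string, so the feed address could be passed to it directly. The tag names would need to be adjusted to whatever the actual feed uses.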

Answer 1: 6, accepted, 2011-08-19 22:41:13

Answer 2: 1, 2011-08-20 07:39:46
