How to get an XML page by URL
Ok, so I have a URL such as https://stackoverflow.com/ and I'm trying to parse it into a Document, but I get an error. Why? Because the page is not an XML file. So the question is: how can I get the data as XML if all I have is the URL?
My code:
import java.net.URL;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class URLReader {
    public static void main(String[] args) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder db = dbf.newDocumentBuilder();
        // The XML parser expects well-formed XML, but this URL serves HTML
        Document doc = db.parse(new URL("https://stackoverflow.com/").openStream());
        int nodes = doc.getChildNodes().getLength();
        System.out.println(nodes + " nodes found");
    }
}
To parse HTML you can use jsoup: https://jsoup.org/
The library can also serialize HTML as XHTML, which is a form of XML:
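// html is a String containing the page's HTML markup
Document document = Jsoup.parse(html);
// Serialize with XML syntax and XHTML-safe entity escaping
document.outputSettings().syntax(Document.OutputSettings.Syntax.xml);
document.outputSettings().escapeMode(org.jsoup.nodes.Entities.EscapeMode.xhtml);
String xhtml = document.html();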
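Once the markup is available as an XHTML string, it can be handed to the same DOM parser the question uses. The following is only a minimal sketch using the JDK alone: the class name XhtmlToDom, the helper parseXhtml, and the hard-coded sample string are assumptions made for illustration, with the sample standing in for whatever document.html() returns.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class XhtmlToDom {
    // Hypothetical helper: parses a well-formed XHTML string into a W3C DOM document.
    static Document parseXhtml(String xhtml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder db = dbf.newDocumentBuilder();
        // Wrap the string in a reader so the parser reads it as character data
        return db.parse(new InputSource(new StringReader(xhtml)));
    }

    public static void main(String[] args) throws Exception {
        // Assumed sample input, standing in for the XHTML string produced by jsoup
        String xhtml = "<html><head><title>Demo</title></head>"
                + "<body><p>Hello<br /></p></body></html>";
        Document doc = parseXhtml(xhtml);
        int paragraphs = doc.getElementsByTagName("p").getLength();
        System.out.println(doc.getDocumentElement().getNodeName()
                + " root, " + paragraphs + " <p> element(s)");
    }
}
```

Note that org.w3c.dom.Document and org.jsoup.nodes.Document share a simple name, so code that uses both in one file needs to fully qualify one of them. If you only have the URL, jsoup can also fetch and parse the page in one step with Jsoup.connect(url).get().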