How can I properly parse this XML file with XPath in Java?
This is my XML file that I need to parse:
<?xml version="1.0"?>
<catalog>
<book id="bk101">
<author>Gambardella, Matthew</author>
<title>XML Developer's Guide</title>
<genre>Computer</genre>
<price>44.95</price>
<publish_date>2000-10-01</publish_date>
<description>An in-depth look at creating applications
with XML.</description>
</book>
<book id="bk102">
<author>Ralls, Kim</author>
<title>Midnight Rain</title>
<genre>Fantasy</genre>
<price>5.95</price>
<publish_date>2002-12-16</publish_date>
<description>A former architect battles corporate zombies,
an evil sorceress, and her own childhood to become queen
of the world.</description>
</book>
<book id="bk103">
<author>Corets, Eva</author>
<title>Maeve Ascendant</title>
<genre>Fantasy</genre>
<price>5.95</price>
<publish_date>2000-11-17</publish_date>
<description>After the collapse of a nanotechnology
society in England, the young survivors lay the
foundation for a new society.</description>
</book>
<book id="bk104">
<author>Corets, Eva</author>
<title>Oberon's Legacy</title>
<genre>Fantasy</genre>
<price>5.95</price>
<publish_date>2001-03-10</publish_date>
<description>In post-apocalypse England, the mysterious
agent known only as Oberon helps to create a new life
for the inhabitants of London. Sequel to Maeve
Ascendant.</description>
</book>
<book id="bk105">
<author>Corets, Eva</author>
<title>The Sundered Grail</title>
<genre>Fantasy</genre>
<price>5.95</price>
<publish_date>2001-09-10</publish_date>
<description>The two daughters of Maeve, half-sisters,
battle one another for control of England. Sequel to
Oberon's Legacy.</description>
</book>
<book id="bk106">
<author>Randall, Cynthia</author>
<title>Lover Birds</title>
<genre>Romance</genre>
<price>4.95</price>
<publish_date>2003-09-02</publish_date>
<description>When Carla meets Paul at an ornithology
conference, tempers fly as feathers get ruffled.</description>
</book>
<book id="bk107">
<author>Thurman, Paula</author>
<title>Splish Splash</title>
<genre>Romance</genre>
<price>4.95</price>
<publish_date>2004-11-02</publish_date>
<description>A deep sea diver finds true love twenty
thousand leagues beneath the sea.</description>
</book>
<book id="bk108">
<author>Knorr, Stefan</author>
<title>Creepy Crawlies</title>
<genre>Horror</genre>
<price>4.95</price>
<publish_date>2005-12-06</publish_date>
<description>An anthology of horror stories about roaches,
centipedes, scorpions and other insects.</description>
</book>
<book id="bk109">
<author>Kress, Peter</author>
<title>Paradox Lost</title>
<genre>Science Fiction</genre>
<price>6.95</price>
<publish_date>2006-11-02</publish_date>
<description>After an inadvertant trip through a Heisenberg
Uncertainty Device, James Salway discovers the problems
of being quantum.</description>
</book>
<book id="bk110">
<author>O'Brien, Tim</author>
<title>Microsoft .NET: The Programming Bible</title>
<genre>Computer</genre>
<price>36.95</price>
<publish_date>2006-12-09</publish_date>
<description>Microsoft's .NET initiative is explored in
detail in this deep programmer's reference.</description>
</book>
<book id="bk111">
<author>O'Brien, Tim</author>
<title>MSXML3: A Comprehensive Guide</title>
<genre>Computer</genre>
<price>36.95</price>
<publish_date>2007-12-01</publish_date>
<description>The Microsoft MSXML3 parser is covered in
detail, with attention to XML DOM interfaces, XSLT processing,
SAX and more.</description>
</book>
<book id="bk112">
<author>Galos, Mike</author>
<title>Visual Studio 7: A Comprehensive Guide</title>
<genre>Computer</genre>
<price>49.95</price>
<publish_date>2008-04-16</publish_date>
<description>Microsoft Visual Studio 7 is explored in depth,
looking at how Visual Basic, Visual C++, C#, and ASP+ are
integrated into a comprehensive development
environment.</description>
</book>
</catalog>
I want to show every book, with all of its information, that has a publish date after 2005 and a price greater than 10. This is my Java code:
package xml;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class Main {
    public static void main(String[] args) throws XPathExpressionException, FileNotFoundException {
        XPathFactory factory = XPathFactory.newInstance();
        XPath xPath = factory.newXPath();
        XPathExpression xPathExpression = xPath.compile("catalog/book[publish_date>2005]/price | catalog/book[price>10]/price");
        File xmlDocument = new File("Books.xml");
        InputSource inputSource = new InputSource(new FileInputStream(xmlDocument));
        Object result = xPathExpression.evaluate(inputSource, XPathConstants.NODESET);
        NodeList nodeList = (NodeList)result;
        for (int i = 0; i < nodeList.getLength(); i++) {
            System.out.println("Info: " + nodeList.item(i).getFirstChild().getNodeValue());
        }
    }
}
You almost have it right. The problem is that, according to the specification, a numeric comparison (< or >) requires implicitly converting each operand to a number. A node's text content is only a valid number if it consists entirely of ASCII digits, with an optional leading minus, optional period, and optional surrounding whitespace.
A date like 2002-12-16 obviously does not qualify, so it converts to NaN and the comparison is always false. However, you can turn it into a string that can be implicitly converted into a number, using substring-before. Note also that | is a union, so your expression selects books that satisfy either condition; use "and" inside a single predicate to require both:
XPathExpression xPathExpression = xPath.compile(
"catalog/book[substring-before(publish_date,'-')>2005 and price>10]/price");
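To see the conversion rule in action, here is a minimal sketch that evaluates a few expressions against a single book from the catalog. The class name XPathNumberDemo and the helpers asNumber and asBoolean are made up for this illustration; the XPath calls are the standard javax.xml.xpath API, which implements XPath 1.0.

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class XPathNumberDemo {
    // One book reduced to the two fields used in the predicate (hypothetical sample).
    static final String XML =
        "<book><price>49.95</price><publish_date>2008-04-16</publish_date></book>";

    // Parses XML and evaluates the expression with the requested return type.
    static Object eval(String expression, QName type) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            XPath xPath = XPathFactory.newInstance().newXPath();
            return xPath.evaluate(expression, doc, type);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    static double asNumber(String expression) {
        return (Double) eval(expression, XPathConstants.NUMBER);
    }

    static boolean asBoolean(String expression) {
        return (Boolean) eval(expression, XPathConstants.BOOLEAN);
    }

    public static void main(String[] args) {
        // The full date is not a valid number, so it becomes NaN.
        System.out.println(asNumber("number(/book/publish_date)"));                        // NaN
        // The part before the first '-' is just the year.
        System.out.println(asNumber("number(substring-before(/book/publish_date,'-'))")); // 2008.0
        // Any comparison with NaN is false, which is why the original predicate matched nothing.
        System.out.println(asBoolean("/book/publish_date > 2005"));                        // false
        System.out.println(asBoolean("substring-before(/book/publish_date,'-') > 2005")); // true
    }
}
```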
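Take advantage of the sortable XML date format (YYYY-MM-DD), do a string comparison on it, and combine your conditions with "and". Be aware that a true string comparison with > requires an XPath 2.0 processor; the XPath 1.0 engine behind javax.xml.xpath converts both operands of > to numbers, so with it the date test has to be made numeric (for example with substring-before).
/catalog/book[(publish_date > '2005') and (number(price) > 10)]
And thus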
XPathExpression xPathExpression = xPath.compile("/catalog/book[(publish_date > '2005') and (number(price) > 10)]");
NodeList bookNodes = (NodeList)xPathExpression.evaluate(inputSource, XPathConstants.NODESET);
for (int i = 0; i < bookNodes.getLength(); i++) {
    // item() returns a Node, so cast it to org.w3c.dom.Element
    Element bookElement = (Element) bookNodes.item(i);
    // getNodeValue() is null for element nodes; getTextContent() returns the text
    System.out.println("Author: " + bookElement.getElementsByTagName("author").item(0).getTextContent());
}
You'll need to add the remaining tags you want to print. Also, if your book elements might not all contain every expected node, you'll need to check that the collection returned by getElementsByTagName() is not empty before calling item(0).
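Putting the pieces together, the following is one possible complete program, offered as a sketch rather than the definitive solution. It assumes "after 2005" means a publish year of 2006 or later, and it uses the numeric substring-before test so that it works with the XPath 1.0 engine shipped with the JDK. The class name BookFilter and the method describeMatches are invented for this example. Instead of looking up each tag by name, it walks the child elements of every matching book, which also sidesteps the problem of missing tags.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class BookFilter {
    // Selects <book> elements whose publish year is above 2005 and whose price is above 10.
    static final String EXPRESSION =
        "/catalog/book[substring-before(publish_date,'-') > 2005 and price > 10]";

    // Returns one "name: value" line per attribute/child element of every matching book.
    static List<String> describeMatches(InputSource source) {
        NodeList books;
        try {
            XPath xPath = XPathFactory.newInstance().newXPath();
            books = (NodeList) xPath.evaluate(EXPRESSION, source, XPathConstants.NODESET);
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < books.getLength(); i++) {
            Element book = (Element) books.item(i);
            lines.add("id: " + book.getAttribute("id"));
            NodeList children = book.getChildNodes();
            for (int j = 0; j < children.getLength(); j++) {
                Node child = children.item(j);
                // Skip the whitespace text nodes between elements.
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    // Collapse line breaks inside multi-line descriptions.
                    String text = child.getTextContent().trim().replaceAll("\\s+", " ");
                    lines.add(child.getNodeName() + ": " + text);
                }
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // "Books.xml" is the file name used in the question.
        try (InputStream in = new FileInputStream("Books.xml")) {
            describeMatches(new InputSource(in)).forEach(System.out::println);
        }
    }
}
```

Run against the catalog in the question, this expression should select bk110, bk111 and bk112; bk109 is published late enough but costs only 6.95.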