繁体   English   中英

如何在 Java 中使用 xPath 正确解析此 XML 文件?

[英]How can I properly parse this XML file with xPath in Java?

这是我需要解析的 XML 文件:

<?xml version="1.0"?>
<catalog>
   <book id="bk101">
      <author>Gambardella, Matthew</author>
      <title>XML Developer's Guide</title>
      <genre>Computer</genre>
      <price>44.95</price>
      <publish_date>2000-10-01</publish_date>
      <description>An in-depth look at creating applications 
      with XML.</description>
   </book>
   <book id="bk102">
      <author>Ralls, Kim</author>
      <title>Midnight Rain</title>
      <genre>Fantasy</genre>
      <price>5.95</price>
      <publish_date>2002-12-16</publish_date>
      <description>A former architect battles corporate zombies, 
      an evil sorceress, and her own childhood to become queen 
      of the world.</description>
   </book>
   <book id="bk103">
      <author>Corets, Eva</author>
      <title>Maeve Ascendant</title>
      <genre>Fantasy</genre>
      <price>5.95</price>
      <publish_date>2000-11-17</publish_date>
      <description>After the collapse of a nanotechnology 
      society in England, the young survivors lay the 
      foundation for a new society.</description>
   </book>
   <book id="bk104">
      <author>Corets, Eva</author>
      <title>Oberon's Legacy</title>
      <genre>Fantasy</genre>
      <price>5.95</price>
      <publish_date>2001-03-10</publish_date>
      <description>In post-apocalypse England, the mysterious 
      agent known only as Oberon helps to create a new life 
      for the inhabitants of London. Sequel to Maeve 
      Ascendant.</description>
   </book>
   <book id="bk105">
      <author>Corets, Eva</author>
      <title>The Sundered Grail</title>
      <genre>Fantasy</genre>
      <price>5.95</price>
      <publish_date>2001-09-10</publish_date>
      <description>The two daughters of Maeve, half-sisters, 
      battle one another for control of England. Sequel to 
      Oberon's Legacy.</description>
   </book>
   <book id="bk106">
      <author>Randall, Cynthia</author>
      <title>Lover Birds</title>
      <genre>Romance</genre>
      <price>4.95</price>
      <publish_date>2003-09-02</publish_date>
      <description>When Carla meets Paul at an ornithology 
      conference, tempers fly as feathers get ruffled.</description>
   </book>
   <book id="bk107">
      <author>Thurman, Paula</author>
      <title>Splish Splash</title>
      <genre>Romance</genre>
      <price>4.95</price>
      <publish_date>2004-11-02</publish_date>
      <description>A deep sea diver finds true love twenty 
      thousand leagues beneath the sea.</description>
   </book>
   <book id="bk108">
      <author>Knorr, Stefan</author>
      <title>Creepy Crawlies</title>
      <genre>Horror</genre>
      <price>4.95</price>
      <publish_date>2005-12-06</publish_date>
      <description>An anthology of horror stories about roaches,
      centipedes, scorpions  and other insects.</description>
   </book>
   <book id="bk109">
      <author>Kress, Peter</author>
      <title>Paradox Lost</title>
      <genre>Science Fiction</genre>
      <price>6.95</price>
      <publish_date>2006-11-02</publish_date>
      <description>After an inadvertant trip through a Heisenberg
      Uncertainty Device, James Salway discovers the problems 
      of being quantum.</description>
   </book>
   <book id="bk110">
      <author>O'Brien, Tim</author>
      <title>Microsoft .NET: The Programming Bible</title>
      <genre>Computer</genre>
      <price>36.95</price>
      <publish_date>2006-12-09</publish_date>
      <description>Microsoft's .NET initiative is explored in 
      detail in this deep programmer's reference.</description>
   </book>
   <book id="bk111">
      <author>O'Brien, Tim</author>
      <title>MSXML3: A Comprehensive Guide</title>
      <genre>Computer</genre>
      <price>36.95</price>
      <publish_date>2007-12-01</publish_date>
      <description>The Microsoft MSXML3 parser is covered in 
      detail, with attention to XML DOM interfaces, XSLT processing, 
      SAX and more.</description>
   </book>
   <book id="bk112">
      <author>Galos, Mike</author>
      <title>Visual Studio 7: A Comprehensive Guide</title>
      <genre>Computer</genre>
      <price>49.95</price>
      <publish_date>2008-04-16</publish_date>
      <description>Microsoft Visual Studio 7 is explored in depth,
      looking at how Visual Basic, Visual C++, C#, and ASP+ are 
      integrated into a comprehensive development 
      environment.</description>
   </book>
</catalog>

我想显示出版日期在 2005 年之后的每本书及其信息。价格大于 10。这是我的 Java 代码:

package xml;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class Main {

    public static void main(String[] args) throws XPathExpressionException, FileNotFoundException {

        XPathFactory factory = XPathFactory.newInstance();
        XPath xPath = factory.newXPath();

        XPathExpression xPathExpression = xPath.compile("catalog/book[publish_date>2005]/price | catalog/book[price>10]/price");

        File xmlDocument = new File("Books.xml");
        InputSource inputSource = new InputSource(new FileInputStream(xmlDocument));

        Object result = xPathExpression.evaluate(inputSource, XPathConstants.NODESET);

        NodeList nodeList = (NodeList)result;

        for (int i = 0; i < nodeList.getLength(); i++) {
            System.out.println("Info: " + nodeList.item(i).getFirstChild().getNodeValue());
        }

    }

}

我应该如何使用此查询正确执行此操作?

添加 Lorem ipsum 以便可以发布问题:Lorem Ipsum 只是印刷和排版行业的虚拟文本。 自 1500 年代以来,Lorem Ipsum 一直是行业的标准虚拟文本,当时一位不知名的印刷商采用了一种类型的厨房并将其加扰以制作一本类型样本书。 它不仅经历了五个世纪,而且经历了电子排版的飞跃,基本保持不变。 它在 1960 年代随着包含 Lorem Ipsum 段落的 Letraset 表的发布而流行起来,最近还随着 Aldus PageMaker 等桌面出版软件(包括 Lorem Ipsum 的版本)而普及。

你几乎是对的。 问题在于,根据规范,数字比较( <> )需要将每个操作数隐式转换为数字。 如果节点的文本内容完全由 ASCII 数字组成,并且带有可选的前导减号、可选的句点和可选的周围空格,则它只是一个有效数字。

2002-12-16这样的日期显然不符合条件。 但是,您可以使用substring-before将其转换为可以隐式转换为数字的字符串:

XPathExpression xPathExpression = xPath.compile(
    "catalog/book[substring-before(publish_date,'-')>2005 and price>10]/price");

利用 XML 日期格式并在那里进行字符串比较,并结合您的条件

/catalog/book[(publish_date > '2005') and (number(price) > 10)]

因此

XPathExpression xPathExpression = xPath.compile("/catalog/book[(publish_date > '2005') and (number(price) > 10)]");
NodeList bookNodes = (NodeList)xPathExpression.evaluate(inputSource, XPathConstants.NODESET);
for (int i = 0; i < bookNodes.getLength(); i++) {
    Element bookElement = bookNodes.item(i);
    System.out.println("Author: " + bookElement.getElementsByTagName("author").item(0).getNodeValue());
}

您需要添加剩余的必要标签。 此外,如果您预订的元素可能并非全部包含所有预期的节点,则需要检查 getElementsByTagName() 返回的集合

暂无
暂无

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM