Java:XML Parser

I have a response XML that looks something like this:

<Response> <aa> <Fromhere> <a1>Content</a1> <a2>Content</a2> </Fromhere> </aa> </Response>

I want to extract the whole content from <Fromhere> to </Fromhere> into a string. Is it possible to do that with a string function or with an XML parser?

Please advise.

You could try an XPath approach to keep the XML parsing simple:

// The parentheses matter: getBytes() must apply to the whole concatenated string.
InputStream response = new ByteArrayInputStream(("<Response> <aa> "
        + "<Fromhere> <a1>Content</a1> <a2>Content</a2> </Fromhere> "
        + "</aa> </Response>").getBytes()); /* Or whatever. */

DocumentBuilder builder = DocumentBuilderFactory
        .newInstance().newDocumentBuilder();
Document doc = builder.parse(response);

XPath xpath = XPathFactory.newInstance().newXPath();
// Element names are case-sensitive: "Fromhere", not "FromHere".
// string(...) yields the element's text content only, not its markup.
XPathExpression expr = xpath.compile("string(/Response/aa/Fromhere)");
String result = (String)expr.evaluate(doc, XPathConstants.STRING);

Note that I haven't tried this code. It may need tweaking.
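Since the question asks for the whole content including the tags, here is a minimal sketch of one way to get markup rather than text: select the element as a node with XPath, then serialize that node with an identity Transformer. The class name FromhereExtractor and the method name extract are made up for illustration; only standard JAXP classes are used.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class FromhereExtractor {

    // Hypothetical helper: returns the <Fromhere> element as markup,
    // or null if the document has no such element at that path.
    static String extract(String xml) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        // Ask XPath for the node itself instead of its string value.
        XPath xpath = XPathFactory.newInstance().newXPath();
        Node node = (Node) xpath.evaluate(
                "/Response/aa/Fromhere", doc, XPathConstants.NODE);
        if (node == null) {
            return null;
        }

        // An identity transform serializes the selected node back to text.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(node), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Response> <aa> <Fromhere> <a1>Content</a1> "
                + "<a2>Content</a2> </Fromhere> </aa> </Response>";
        System.out.println(extract(xml));
    }
}
```

If only the children of <Fromhere> are wanted, the same serialization could be run over node.getChildNodes() instead.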

Through an XML parser. Using string functions to parse XML is a bad idea.
Besides the Sun tutorials pointed out above, you can check the DZone Refcardz on Java and XML; I found it a good, terse explanation of how to do it.
There are probably plenty of web resources on the topic as well, including on this very site.

You can apply an XSLT stylesheet to extract the desired content.

This stylesheet should fit your example:

<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:template match="/Response/aa/Fromhere/*">
        <xsl:copy>
            <xsl:apply-templates/>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>

Apply it with something like the following (exception handling not included):

String xml = "<Response> <aa> <Fromhere> <a1>Content</a1> <a2>Content</a2> </Fromhere> </aa> </Response>";
Source xsl = new StreamSource(new FileReader("/path/to/file.xsl"));

TransformerFactory transformerFactory = TransformerFactory.newInstance();
Transformer transformer = transformerFactory.newTransformer(xsl);
transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

StringWriter out = new StringWriter();
transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));

System.out.println(out.toString());

This should work with any version of Java starting with 1.4.
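For trying the XSLT route without a separate stylesheet file, here is a minimal sketch that inlines a stylesheet as a string. It is a simplified variant, not the stylesheet above: it uses xsl:copy-of to copy the whole <Fromhere> element in one step. The class name XsltExtract and the method name extract are made up for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class XsltExtract {

    // Illustrative stylesheet: copy the <Fromhere> subtree and nothing else.
    static final String XSL =
          "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:template match=\"/\">"
        + "<xsl:copy-of select=\"/Response/aa/Fromhere\"/>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Hypothetical helper: applies the inline stylesheet to the given XML.
    static String extract(String xml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSL)));
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        StringWriter out = new StringWriter();
        transformer.transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Response> <aa> <Fromhere> <a1>Content</a1> "
                + "<a2>Content</a2> </Fromhere> </aa> </Response>";
        System.out.println(extract(xml));
    }
}
```

Matching on the document root and copying a single subtree means the built-in template rules never run, so no text from outside <Fromhere> reaches the output.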

This should work:

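import java.util.regex.*;

// DOTALL lets '.' span line breaks; '.*?' stops at the first closing tag.
Pattern p = Pattern.compile("<Fromhere>.*?</Fromhere>", Pattern.DOTALL);
Matcher m = p.matcher(responseString);
// find() must succeed before group() is called, or group() throws IllegalStateException.
String whatYouWant = m.find() ? m.group() : null;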

It would be a little more verbose to use Scanner, but that could work too.

Whether this is a good idea is a question for someone more experienced than I am.
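To make the risk concrete, here is a small sketch of how a pattern like this can go wrong: a regex has no notion of XML comments, so it matches tag-like text inside one. The class name RegexPitfall and the sample input are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexPitfall {

    // Returns the first regex match for a <Fromhere>...</Fromhere> span, or null.
    static String firstMatch(String xml) {
        Pattern p = Pattern.compile("<Fromhere>.*?</Fromhere>", Pattern.DOTALL);
        Matcher m = p.matcher(xml);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // The real element comes after a comment that happens to mention the tag.
        String xml = "<Response><!-- <Fromhere></Fromhere> --><aa>"
                + "<Fromhere><a1>Content</a1></Fromhere></aa></Response>";

        // The regex picks up the commented-out text, not the real element.
        System.out.println(firstMatch(xml)); // prints "<Fromhere></Fromhere>"
    }
}
```

A parser-based approach ignores the comment, which is the main argument for not using string functions here.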

One option is to use a StreamFilter:

class MyFilter implements StreamFilter {
  private boolean on;

  @Override
  public boolean accept(XMLStreamReader reader) {
    final String element = "Fromhere";
    if (reader.isStartElement() && element.equals(reader.getLocalName())) {
      on = true;
    } else if (reader.isEndElement()
        && element.equals(reader.getLocalName())) {
      on = false;
      return true;
    }
    return on;
  }
}

Combined with a Transformer, you can use this to safely parse logically equivalent markup like this:

<Response>
  <!-- <Fromhere></Fromhere> -->
  <aa>
    <Fromhere>
      <a1>Content</a1> <a2>Content</a2>
    </Fromhere>
  </aa>
</Response>

Demo:

StringWriter writer = new StringWriter();

XMLInputFactory inputFactory = XMLInputFactory.newInstance();
XMLStreamReader reader = inputFactory
    .createXMLStreamReader(new StringReader(xmlString));
reader = inputFactory.createFilteredReader(reader, new MyFilter());
TransformerFactory transFactory = TransformerFactory.newInstance();
Transformer transformer = transFactory.newTransformer();
transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
transformer.transform(new StAXSource(reader), new StreamResult(writer));

System.out.println(writer.toString());

This is a programmatic variation on Massimiliano Fliri's approach.
