Getting the parent-child hierarchy in a SAX XML parser

Question

I'm using SAX (Simple API for XML) to parse an XML document. I get output for every tag the file contains, but I want the tags shown in their parent-child hierarchy. For example, this is my current output:

<dblp>
<www>
<author>
</author><title>
</title><url>
</url><year>
</year></www><inproceedings>
<month>
</month><pages>
</pages><booktitle>
</booktitle><note>
</note><cdrom>
</cdrom></inproceedings><article>
<journal>
</journal><volume>
</volume></article><ee>
</ee><book>
<publisher>
</publisher><isbn>
</isbn></book><incollection>
<crossref>
</crossref></incollection><editor>
</editor><series>
</series></dblp>

But I want the output to look like this, with the children indented under their parent:

<dblp>
  <www>
    <author>
    </author>
    <title>
    </title>
    <url>
    </url>
    <year>
    </year>
  </www>
  <inproceedings>
    <month>
    </month>
    <pages>
    </pages>
    <booktitle>
    </booktitle>
    <note>
    </note>
    <cdrom>
    </cdrom>
  </inproceedings>
  <article>
    <journal>
    </journal>
    <volume>
    </volume>
  </article>
  <ee>
  </ee>
  <book>
    <publisher>
    </publisher>
    <isbn>
    </isbn>
  </book>
  <incollection>
    <crossref>
    </crossref>
  </incollection>
  <editor>
  </editor>
  <series>
  </series>
</dblp>

But I can't figure out how to detect whether the parser is currently handling a parent tag or a child tag.

Here is my code:

package com.teamincredibles.sax;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class Parser extends DefaultHandler {

  public void getXml() {
    try {
      SAXParserFactory saxParserFactory = SAXParserFactory.newInstance();
      SAXParser saxParser = saxParserFactory.newSAXParser();
      final MySet openingTagList = new MySet();
      final MySet closingTagList = new MySet();
      DefaultHandler defaultHandler = new DefaultHandler() {

        public void startDocument() throws SAXException {
          System.out.println("Starting Parsing...\n");
        }

        public void endDocument() throws SAXException {
          System.out.print("\n\nDone Parsing!");
        }

        public void startElement(String uri, String localName, String qName,
          Attributes attributes) throws SAXException {
          if (!openingTagList.contains(qName)) {
            openingTagList.add(qName);
            System.out.print("<" + qName + ">\n");
          }
        }

        public void characters(char ch[], int start, int length)
        throws SAXException {
          /*for(int i=start; i<(start+length);i++){
            System.out.print(ch[i]);
        }*/
        }

        public void endElement(String uri, String localName, String qName)
        throws SAXException {
          if (!closingTagList.contains(qName)) {
            closingTagList.add(qName);
            System.out.print("</" + qName + ">");
          }
        }
      };

      saxParser.parse("xml/sample.xml", defaultHandler);
    } catch (Exception e) {
      e.printStackTrace();
    }
  }

  public static void main(String args[]) {
    Parser readXml = new Parser();
    readXml.getXml();
  }
}

Answer 1

You could consider a StAX implementation:

package be.duo.stax;

import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class StaxExample {

    public void getXml() {
        InputStream is = null;
        try {
            is = new FileInputStream("c:\\dev\\sample.xml");

            XMLInputFactory inputFactory = XMLInputFactory.newInstance();
            XMLStreamReader reader = inputFactory.createXMLStreamReader(is);

            parse(reader, 0);

        } catch(Exception ex) {
            System.out.println(ex.getMessage());
        } finally {
            if(is != null) {
                try {
                    is.close();
                } catch(IOException ioe) {
                    System.out.println(ioe.getMessage());
                }
            }
        }

    }

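    // Recursive walk: each call handles the children of one element.
    private void parse(XMLStreamReader reader, int depth) throws XMLStreamException {
        // Loop on hasNext() so the outermost call stops after END_DOCUMENT
        // instead of spinning forever once the stream is exhausted.
        while (reader.hasNext()) {
            switch (reader.next()) {
            case XMLStreamConstants.START_ELEMENT:
                writeBeginTag(reader.getLocalName(), depth);
                parse(reader, depth + 1);
                break;
            case XMLStreamConstants.END_ELEMENT:
                // The closing tag belongs to the parent level of this call.
                writeEndTag(reader.getLocalName(), depth - 1);
                return;
            default:
                // Text, comments and other events are ignored.
                break;
            }
        }
    }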

    private void writeBeginTag(String tag, int depth) {
        // Two spaces per nesting level, matching the layout asked for in the question.
        for(int i = 0; i < depth; i++) {
            System.out.print("  ");
        }
        System.out.println("<" + tag + ">");
    }

    private void writeEndTag(String tag, int depth) {
        // Same indentation as the matching opening tag.
        for(int i = 0; i < depth; i++) {
            System.out.print("  ");
        }
        System.out.println("</" + tag + ">");
    }

    public static void main(String[] args) {
        StaxExample app = new StaxExample();
        app.getXml();
    }

}

There is a StAX idiom that uses a loop like this for each tag in the XML:

// MyTagObject, parseMyOtherTag1 and parseMyOtherTag2 are placeholders for your own types and methods.
private MyTagObject parseMyTag(XMLStreamReader reader, String myTag) throws XMLStreamException {
    MyTagObject myTagObject = new MyTagObject();
    while (true) {
        switch (reader.next()) {
        case XMLStreamConstants.START_ELEMENT:
            String localName = reader.getLocalName();
            if (localName.equals("myOtherTag1")) {
                myTagObject.setMyOtherTag1(parseMyOtherTag1(reader, localName));
            } else if (localName.equals("myOtherTag2")) {
                myTagObject.setMyOtherTag2(parseMyOtherTag2(reader, localName));
            }
            // and so on
            break;
        case XMLStreamConstants.END_ELEMENT:
            // Return once the closing tag of the element handled by this method is reached.
            if (reader.getLocalName().equals(myTag)) {
                return myTagObject;
            }
            break;
        }
    }
}

Answer 2

Well, what have you tried? You should use a Transformer, as described in "How to pretty print XML from Java?":

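// Needs javax.xml.transform.{OutputKeys, Transformer, TransformerFactory},
// javax.xml.transform.dom.DOMSource, javax.xml.transform.stream.StreamResult
// and java.io.StringWriter.
// 'doc' is an org.w3c.dom.Document that has already been parsed.
Transformer transformer = TransformerFactory.newInstance().newTransformer();
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
// Initialize StreamResult with a File object instead to save to a file.
StreamResult result = new StreamResult(new StringWriter());
DOMSource source = new DOMSource(doc);
transformer.transform(source, result);
String xmlString = result.getWriter().toString();
System.out.println(xmlString);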
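
The snippet above assumes a `doc` that already exists. A self-contained sketch of the same idea might look like the following. The class name `PrettyPrintSketch` and the method `prettyPrint` are made up for illustration, and the `indent-amount` property is specific to the Xalan-derived transformer bundled with the JDK, so other implementations may ignore it. Note that this route builds the whole document in memory as a DOM tree and prints element text as well as tags, which differs from the tags-only SAX output in the question.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original answer.
class PrettyPrintSketch {

    // Parses the XML text into a DOM tree and serializes it again with indentation.
    static String prettyPrint(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            // Assumption: implementation-specific property understood by the JDK's
            // built-in transformer; it sets the number of spaces per level.
            transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");

            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not pretty print XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(prettyPrint("<dblp><www><author>x</author></www></dblp>"));
    }
}
```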

Answer 3

Almost any useful SAX application needs to maintain a stack. When startElement is called, you push information onto the stack; when endElement is called, you pop the stack. Exactly what you put on the stack depends on the application; it's often the element name. For your application, you don't actually need a full stack, you only need to know its depth. You could get by with maintaining this using depth++ in startElement() and depth-- in endElement(). Then you just output depth spaces before the element name.
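
A minimal sketch of the depth-counter idea described above, assuming two spaces per level to match the layout in the question. The class name `DepthIndentSketch` and the helper `render` are invented for illustration. Unlike the handler in the question, it prints every occurrence of a tag and does not de-duplicate tag names.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: writes each tag indented by its nesting depth.
class DepthIndentSketch extends DefaultHandler {
    private final StringBuilder out = new StringBuilder();
    private int depth = 0; // number of elements currently open

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes attributes) {
        indent();
        out.append('<').append(qName).append(">\n");
        depth++; // children of this element are one level deeper
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        depth--; // back to the level of the element being closed
        indent();
        out.append("</").append(qName).append(">\n");
    }

    // Appends two spaces per nesting level (an assumption taken from the question's layout).
    private void indent() {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
    }

    // Parses the given XML text and returns the indented tag listing.
    static String render(String xml) {
        try {
            DepthIndentSketch handler = new DepthIndentSketch();
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
            return handler.out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.print(render("<dblp><www><author>A</author></www><ee/></dblp>"));
    }
}
```

If the name of the enclosing element is also needed, the counter could be swapped for a `java.util.ArrayDeque<String>` that is pushed in startElement and popped in endElement; its size then plays the role of `depth`.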

Answer 1: 1 vote, accepted, 2015-04-02 20:51:29
Answer 2: 0 votes, 2015-03-31 05:16:49
Answer 3: 0 votes, 2015-03-31 07:48:02
