
Getting parent-child hierarchy in a SAX XML parser

I'm using SAX (Simple API for XML) to parse an XML document. I'm getting output for all the tags the file has, but I want it to show the tags in a parent-child hierarchy. For example, this is my output:

<dblp>
<www>
<author>
</author><title>
</title><url>
</url><year>
</year></www><inproceedings>
<month>
</month><pages>
</pages><booktitle>
</booktitle><note>
</note><cdrom>
</cdrom></inproceedings><article>
<journal>
</journal><volume>
</volume></article><ee>
</ee><book>
<publisher>
</publisher><isbn>
</isbn></book><incollection>
<crossref>
</crossref></incollection><editor>
</editor><series>
</series></dblp>

But I want it to display the output like this, with the children indented by extra spacing:

<dblp>
  <www>
    <author>
    </author>
    <title>
    </title>
    <url>
    </url>
    <year>
    </year>
  </www>
  <inproceedings>
    <month>
    </month>
    <pages>
    </pages>
    <booktitle>
    </booktitle>
    <note>
    </note>
    <cdrom>
    </cdrom>
  </inproceedings>
  <article>
    <journal>
    </journal>
    <volume>
    </volume>
  </article>
  <ee>
  </ee>
  <book>
    <publisher>
    </publisher>
    <isbn>
    </isbn>
  </book>
  <incollection>
    <crossref>
    </crossref>
  </incollection>
  <editor>
  </editor>
  <series>
  </series>
</dblp>

But I can't figure out how to detect whether the parser is currently parsing a parent tag or a child.

Here is my code:

package com.teamincredibles.sax;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class Parser extends DefaultHandler {

  public void getXml() {
    try {
      SAXParserFactory saxParserFactory = SAXParserFactory.newInstance();
      SAXParser saxParser = saxParserFactory.newSAXParser();
      final MySet openingTagList = new MySet();
      final MySet closingTagList = new MySet();
      DefaultHandler defaultHandler = new DefaultHandler() {

        public void startDocument() throws SAXException {
          System.out.println("Starting Parsing...\n");
        }

        public void endDocument() throws SAXException {
          System.out.print("\n\nDone Parsing!");
        }

        public void startElement(String uri, String localName, String qName,
          Attributes attributes) throws SAXException {
          if (!openingTagList.contains(qName)) {
            openingTagList.add(qName);
            System.out.print("<" + qName + ">\n");
          }
        }

        public void characters(char ch[], int start, int length)
        throws SAXException {
          /*for(int i=start; i<(start+length);i++){
            System.out.print(ch[i]);
        }*/
        }

        public void endElement(String uri, String localName, String qName)
        throws SAXException {
          if (!closingTagList.contains(qName)) {
            closingTagList.add(qName);
            System.out.print("</" + qName + ">");
          }
        }
      };

      saxParser.parse("xml/sample.xml", defaultHandler);
    } catch (Exception e) {
      e.printStackTrace();
    }
  }

  public static void main(String args[]) {
    Parser readXml = new Parser();
    readXml.getXml();
  }
}

You can consider a StAX implementation:

package be.duo.stax;

import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class StaxExample {

    public void getXml() {
        InputStream is = null;
        try {
            is = new FileInputStream("c:\\dev\\sample.xml");

            XMLInputFactory inputFactory = XMLInputFactory.newInstance();
            XMLStreamReader reader = inputFactory.createXMLStreamReader(is);

            parse(reader, 0);

        } catch(Exception ex) {
            System.out.println(ex.getMessage());
        } finally {
            if(is != null) {
                try {
                    is.close();
                } catch(IOException ioe) {
                    System.out.println(ioe.getMessage());
                }
            }
        }

    }

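    private void parse(XMLStreamReader reader, int depth) throws XMLStreamException {
        // Loop on hasNext() so the outermost call ends at END_DOCUMENT instead of spinning forever.
        while(reader.hasNext()) {
            switch(reader.next()) {
            case XMLStreamConstants.START_ELEMENT:
                writeBeginTag(reader.getLocalName(), depth);
                // Recurse for the children; returns once this element's end tag has been consumed.
                parse(reader, depth+1);
                break;
            case XMLStreamConstants.END_ELEMENT:
                // This end tag belongs to the element opened by the caller, one level up.
                writeEndTag(reader.getLocalName(), depth-1);
                return;
            }
        }
    }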

    private void writeBeginTag(String tag, int depth) {
        for(int i = 0; i < depth; i++) {
            System.out.print(" ");
        }
        System.out.println("<" + tag + ">");
    }

    private void writeEndTag(String tag, int depth) {
        for(int i = 0; i < depth; i++) {
            System.out.print(" ");
        }
        System.out.println("</" + tag + ">");
    }

    public static void main(String[] args) {
        StaxExample app = new StaxExample();
        app.getXml();
    }

}

There is an idiom for StAX that uses a loop like this for every tag in the XML:

private MyTagObject parseMyTag(XMLStreamReader reader, String myTag) throws XMLStreamException {
    MyTagObject myTagObject = new MyTagObject();
    while (true) {
        switch (reader.next()) {
        case XMLStreamConstants.START_ELEMENT:
            String localName = reader.getLocalName();
            if(localName.equals("myOtherTag1")) {
                myTagObject.setMyOtherTag1(parseMyOtherTag1(reader, localName));
            } else if(localName.equals("myOtherTag2")) {
                myTagObject.setMyOtherTag2(parseMyOtherTag2(reader, localName));
            }
            // and so on
            break;
        case XMLStreamConstants.END_ELEMENT:
            if(reader.getLocalName().equals(myTag)) {
                return myTagObject;
            }
            break;
        }
    }
}

Well, what have you tried? You could use a Transformer, as described in "How to pretty print XML from Java?":

// Classes come from javax.xml.transform, javax.xml.transform.dom and javax.xml.transform.stream.
// doc is an org.w3c.dom.Document that has already been parsed.
Transformer transformer = TransformerFactory.newInstance().newTransformer();
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
// Initialize StreamResult with a File object instead to save to a file.
StreamResult result = new StreamResult(new StringWriter());
DOMSource source = new DOMSource(doc);
transformer.transform(source, result);
String xmlString = result.getWriter().toString();
System.out.println(xmlString);

Almost any useful SAX application needs to maintain a stack. When startElement() is called, you push information onto the stack; when endElement() is called, you pop it. Exactly what you put on the stack depends on the application; it's often the element name. For your application, you don't actually need a full stack; you only need to know its depth. You could get by with depth++ in startElement() and depth-- in endElement(). Then you just output depth spaces before the element name.
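
A minimal sketch of that depth-counter idea follows. The class name DepthHandler and the helper render() are made up for illustration, and two spaces per level is an assumption taken from the desired output in the question. Unlike the question's code, it prints every element rather than only the first occurrence of each tag name, and it collects the output in a StringBuilder instead of printing directly.

```java
import java.io.StringReader;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: tracks nesting depth and indents each tag accordingly.
public class DepthHandler extends DefaultHandler {

    private final StringBuilder out = new StringBuilder();
    private int depth = 0; // number of elements currently open

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes attributes) {
        indent();
        out.append('<').append(qName).append(">\n");
        depth++; // children of this element are one level deeper
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        depth--; // return to the level of the element being closed
        indent();
        out.append("</").append(qName).append(">\n");
    }

    // Two spaces per level (assumed from the desired output in the question).
    private void indent() {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
    }

    public String getOutput() {
        return out.toString();
    }

    // Hypothetical convenience method: parses an XML string and returns the indented tags.
    public static String render(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        DepthHandler handler = new DepthHandler();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return handler.getOutput();
    }

    public static void main(String[] args) throws Exception {
        System.out.print(render("<dblp><www><author>x</author></www><ee/></dblp>"));
    }
}
```

If the name of the parent is needed as well, the int counter could be swapped for a java.util.ArrayDeque<String> of open element names: push in startElement(), pop in endElement(), and the stack's size is the depth while peek() gives the current parent.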
