XML not parsing with Escape character

I am trying to write a simple SAX parser. The input comes from a web service response, and it contains the escape sequences &lt; and &gt; in place of < and >. When I parse it with the code below, I get the error "Reference is not allowed in prolog." If I change the escape sequences to the plain < and > characters, it parses without any issues. I guess I am just missing something very simple here. Can somebody please help?

import java.io.ByteArrayInputStream;
import java.io.InputStream;

import org.xml.sax.XMLReader;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.XMLReaderFactory;
import org.xml.sax.helpers.DefaultHandler;

public class Test extends DefaultHandler {

    public static void main(String args[]) throws Exception {
        XMLReader xr = XMLReaderFactory.createXMLReader();
        Test handler = new Test();
        xr.setContentHandler(handler);
        xr.setErrorHandler(handler);

        String xml_string = "&lt;rootnode&gt;&lt;a&gt;hello&lt;/a&gt;&lt;b&gt;world&lt;/b&gt;&lt;/rootnode&gt;";
        InputStream xmlStream = new ByteArrayInputStream(xml_string.getBytes("UTF-8"));
        xr.parse(new InputSource(xmlStream));
    }

    public Test() {
        super();
    }

    ////////////////////////////////////////////////////////////////////
    // Event handlers.
    ////////////////////////////////////////////////////////////////////

    public void startDocument() {
        System.out.println("Start document");
    }

    public void endDocument() {
        System.out.println("End document");
    }

    public void startElement(String uri, String name, String qName, Attributes atts) {
        if ("".equals(uri))
            System.out.println("Start element: " + qName);
        else
            System.out.println("Start element: {" + uri + "}" + name);
    }

    public void endElement(String uri, String name, String qName) {
        if ("".equals(uri))
            System.out.println("End element: " + qName);
        else
            System.out.println("End element:   {" + uri + "}" + name);
    }

    public void characters(char ch[], int start, int length) {
        System.out.print("Characters:    \"");
        for (int i = start; i < start + length; i++) {
            switch (ch[i]) {
            case '\\':
                System.out.print("\\\\");
                break;
            case '"':
                System.out.print("\\\"");
                break;
            case '\n':
                System.out.print("\\n");
                break;
            case '\r':
                System.out.print("\\r");
                break;
            case '\t':
                System.out.print("\\t");
                break;
            default:
                System.out.print(ch[i]);
                break;
            }
        }
        System.out.print("\"\n");
    }
}

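You shouldn't be using escape sequences in your xml_string. Use < and > for the XML tags themselves, and only escape them (as &lt; and &gt;) when < or > appears as part of the content of an attribute or element, not in the element tag itself. As written, your string starts with an entity reference (&lt;) before any root element, which is why the parser reports "Reference is not allowed in prolog."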
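If the web service really does hand the markup back in escaped form (for example because the XML was embedded as text inside another document), one option is to unescape it before parsing. The following is only a minimal sketch: the class name `UnescapeThenParse` and the helpers `unescapeXml` and `elementNames` are made up for illustration, and the helper reverses only the five predefined XML entities, not numeric character references.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class UnescapeThenParse {

    // Hypothetical helper: reverses only the five predefined XML entities.
    // &amp; is replaced last so that "&amp;lt;" becomes "&lt;", not "<".
    static String unescapeXml(String s) {
        return s.replace("&lt;", "<")
                .replace("&gt;", ">")
                .replace("&quot;", "\"")
                .replace("&apos;", "'")
                .replace("&amp;", "&");
    }

    // Parses the markup with SAX and returns the element names in the order they open.
    static List<String> elementNames(String xml) throws Exception {
        List<String> names = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                names.add(qName);
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return names;
    }

    public static void main(String[] args) throws Exception {
        String escaped = "&lt;rootnode&gt;&lt;a&gt;hello&lt;/a&gt;&lt;b&gt;world&lt;/b&gt;&lt;/rootnode&gt;";
        // Unescape first, then parse the real markup.
        System.out.println(elementNames(unescapeXml(escaped))); // prints [rootnode, a, b]
    }
}
```

If the escaped string is actually the text content of an outer XML response, parsing that outer document and reading the element's text is usually cleaner, since the parser does the unescaping for you.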

For normal tags one should use < and >, like <root>...</root>. Only in real text should < and > be escaped to &lt; and &gt;.
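To illustrate the other half of that rule, here is a small sketch of escaped characters used where they belong, inside element text. The class name `EscapedTextDemo` and the helper `textOf` are hypothetical. The parser resolves the entities itself and hands the literal characters to `characters()`.

```java
import java.io.StringReader;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class EscapedTextDemo {

    // Collects all character data the SAX parser reports for the document.
    static String textOf(String xml) throws Exception {
        StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                // May be called several times per text node, so append each chunk.
                text.append(ch, start, length);
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        // Real tags use < and >; only the text content is escaped.
        String xml = "<a>1 &lt; 2 &amp;&amp; 3 &gt; 2</a>";
        System.out.println(textOf(xml)); // prints 1 < 2 && 3 > 2
    }
}
```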
