
Reading escape characters with XMLStreamReader

Hi, I have a problem reading escaped characters inside an XML document using XMLStreamReader.

For instance, I have this element:

<a>foo&amp;bar</a>

When I read the value, everything after the &amp; is truncated, and the value I get is "foo".

Any ideas how this could be fixed?

To force XMLStreamReader to return the text as a single string, set the javax.xml.stream.isCoalescing property, as indicated by the XMLStreamReader#next() documentation:

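StringReader stringReader = new StringReader("<a>foo&amp;bar</a>");  // the element from the question
XMLInputFactory factory = XMLInputFactory.newInstance();
factory.setProperty("javax.xml.stream.isCoalescing", true);  // report contiguous text as a single CHARACTERS event
XMLStreamReader xmlStreamReader = factory.createXMLStreamReader(stringReader);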
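
For completeness, here is a minimal, self-contained sketch of the same idea applied to the element from the question. The class name CoalescingDemo and the helper firstText are made up for illustration; only the factory property and the reader calls come from the StAX API.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical demo class, not part of the original answer.
class CoalescingDemo {
    // Returns the text of the first CHARACTERS event, or null if there is none.
    static String firstText(String xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Ask the parser to merge contiguous character data into one event.
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.CHARACTERS) {
                    return reader.getText();
                }
            }
            return null;
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(firstText("<a>foo&amp;bar</a>"));  // prints foo&bar
    }
}
```

XMLInputFactory.IS_COALESCING is the constant for the "javax.xml.stream.isCoalescing" property name, so either spelling should behave the same.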

I'm not sure what the problem is - my test produces the results you expect.

Running

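import java.io.PrintWriter;
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamReader;

XMLInputFactory xmlInputFactory = XMLInputFactory.newInstance();
XMLStreamReader reader = xmlInputFactory.createXMLStreamReader(
     new StringReader("<tag>foo&amp;bar</tag>"));
PrintWriter pw = new PrintWriter(System.out, true);
while (reader.hasNext())
{
    reader.next();
    // Print the numeric event type, followed by the text if the event carries any.
    pw.print(reader.getEventType());
    if (reader.hasText())
        pw.append(' ').append(reader.getText());
    pw.println();
}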

Produces

1
4 foo
4 &
4 bar
2
8

This is on JDK 1.6.0.11, which I know is rather old. I'll upgrade and post back if the results differ.

One thing to bear in mind is that XMLStreamReader can (and does!) break up character data into several blocks, as you can see above: the repeated 4 events (4 = CHARACTERS) show that the text of the element is delivered as three separate events.
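
If you would rather not depend on the coalescing property, one way to cope with the split events is to append every text event to a buffer yourself. The following is only a sketch: TextCollector and collectText are illustrative names, and it gathers all text in the document rather than the text of one particular element.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class, shown for illustration only.
class TextCollector {
    // Concatenates the text of every CHARACTERS or CDATA event in the document.
    static String collectText(String xml) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        StringBuilder text = new StringBuilder();
        try {
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.CHARACTERS
                        || event == XMLStreamConstants.CDATA) {
                    // Each chunk ("foo", "&", "bar") is appended in document order.
                    text.append(reader.getText());
                }
            }
        } finally {
            reader.close();
        }
        return text.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(collectText("<tag>foo&amp;bar</tag>"));  // prints foo&bar
    }
}
```

For an element that contains only text, XMLStreamReader.getElementText(), called while the reader is positioned on the START_ELEMENT event, performs a similar concatenation for you.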

After reading the string, is there any way to keep the & escaped in the output? For example:

1
4 foo&amp;bar
2
8
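
One possible approach, offered as a sketch rather than a definitive answer: the parser always decodes &amp; to &, so the escaped form has to be put back after reading. The example below assumes coalescing is enabled and re-escapes the text by hand; KeepEscapedDemo, escapeXmlText and escapedText are made-up names.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical demo class, shown for illustration only.
class KeepEscapedDemo {
    // Re-escapes the characters that need escaping in XML text content.
    // The ampersand must be handled first so that it is not escaped twice.
    static String escapeXmlText(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;");
    }

    // Returns the re-escaped text of the first CHARACTERS event, or null if none.
    static String escapedText(String xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.CHARACTERS) {
                    return escapeXmlText(reader.getText());
                }
            }
            return null;
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(escapedText("<tag>foo&amp;bar</tag>"));  // prints foo&amp;bar
    }
}
```

Note that this normalizes the text rather than preserving the original bytes: an ampersand written as a numeric character reference in the input would also come out as &amp;.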


 