
Java Removing Characters from XML

I'm reading XML data using Java and DOM. When I print a variable to the console I notice it prints on two different lines.

Output:

Hello How are
you today?

When I look at the attribute I'm trying to print in the underlying XML document, I see the following:

<element attribute = "Hello How are&#xD;&#xA;you today?"></element>

How do I remove the characters &#xD;&#xA; from the attribute value in Java?

If the data from the attribute is stored in a Java String variable called myVar, I tried the following unsuccessfully:

if (myVar.contains("&#xD;&#xA;")) {
    myVar = myVar.replaceAll("&#xD;&#xA;", " ");
}

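&#xD;&#xA; is a line break written as XML character references. The parser resolves the references before the value reaches your code, so the Java string contains the characters 0xD 0xA (13 10, carriage return then line feed) and never the literal text "&#xD;&#xA;". That is why your replaceAll finds nothing to replace. For the pattern, use either "\\r\\n" (in that order) or "\\s+", with " " as the replacement.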
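To illustrate the point about character references, here is a minimal sketch that parses the element from the question with the JDK's DOM parser and inspects the attribute value. The class name LineBreakCleaner and the helper methods readAttribute and collapseLineBreaks are made up for this example; only the XML snippet and the variable name myVar come from the question.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original question.
class LineBreakCleaner {

    // Parses the XML text and returns the root element's "attribute" value.
    static String readAttribute(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getAttribute("attribute");
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Replaces each CRLF pair, lone CR, or lone LF with a single space.
    static String collapseLineBreaks(String value) {
        return value.replaceAll("\\r\\n|\\r|\\n", " ");
    }

    public static void main(String[] args) {
        String xml = "<element attribute=\"Hello How are&#xD;&#xA;you today?\"></element>";
        String myVar = readAttribute(xml);

        System.out.println(myVar.contains("&#xD;&#xA;")); // false: the references are already resolved
        System.out.println(myVar.contains("\r\n"));       // true: the value holds CR followed by LF
        System.out.println(collapseLineBreaks(myVar));    // Hello How are you today?
    }
}
```

The alternation in collapseLineBreaks is one possible choice: it treats a CRLF pair as a single break, so the result has one space where the line break was.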

replaceAll("\\s+", " ") worked, and so did replaceAll("\\r\\n", " "). On the other hand, "\\n\\r" as the first argument to replaceAll did not work, because the value contains a carriage return followed by a line feed, not the reverse.
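The difference between the three patterns can be checked directly on a string that already holds the resolved characters. This is only a sketch: the class LineBreakPatterns and its method names are invented for the comparison, and the input string is assumed to match what the parser returns for the attribute in the question.

```java
// Hypothetical comparison class, not part of the original thread.
class LineBreakPatterns {

    // Collapses every run of whitespace (spaces, tabs, CR, LF) into one space.
    static String viaWhitespace(String s) {
        return s.replaceAll("\\s+", " ");
    }

    // Replaces only the exact sequence CR followed by LF.
    static String viaCrLf(String s) {
        return s.replaceAll("\\r\\n", " ");
    }

    // Looks for LF followed by CR, which does not occur in a CRLF line break.
    static String viaLfCr(String s) {
        return s.replaceAll("\\n\\r", " ");
    }

    public static void main(String[] args) {
        String myVar = "Hello How are\r\nyou today?";

        System.out.println(viaWhitespace(myVar));           // Hello How are you today?
        System.out.println(viaCrLf(myVar));                 // Hello How are you today?
        System.out.println(viaLfCr(myVar).equals(myVar));   // true: nothing was replaced
    }
}
```

Note that "\\s+" is broader than "\\r\\n": it also collapses tabs and repeated spaces elsewhere in the value, which may or may not be what you want.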
