
XML character entity reference

I'm parsing some data from an XML document and then writing it back to another XML document. My problem is that the data in the original document is inside a CDATA section.

This is an example of the input:

<actions><![CDATA[<div>
check that&#39;s is sent </div>

I simply stripped the div, p, etc. tags with substring functions, but my output was:

<logical>check that &amp;#39; is sent </logical>

I want the content of the output to appear to be the same as the input:

<logical>check that's is sent </logical>

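One likely reason for the &amp;#39; in the output, assuming the second document is written through a standard XML serializer: inside CDATA the text &#39; is six literal characters, not a character reference, so the serializer escapes its & when writing it back as ordinary text. The sketch below shows that effect with the JDK's javax.xml.transform.Transformer. The class name EscapeDemo and the choice of serializer are assumptions, since the question does not say how the output is written.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class: serializes a single <logical> element holding the given text.
final class EscapeDemo {
    static String serialize(String text) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element logical = doc.createElement("logical");
            // The string is stored as plain character data; nothing in it is parsed.
            logical.setTextContent(text);
            doc.appendChild(logical);

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            // The serializer escapes '&' in text nodes while writing.
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not serialize demo document", e);
        }
    }
}
```

Under these assumptions, EscapeDemo.serialize("check that&#39;s is sent ") should contain &amp;#39;, while passing the already decoded text "check that's is sent " keeps the apostrophe as it is. That suggests decoding the references before handing the string to the writer.
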
I also tried a regex replacement, like this:

string = string.replaceAll("&#\\d+;", " 39");

but the number varies, so instead of a fixed replacement I need to insert the number that appears inside each &#num; reference.

Also, the string may contain several such references, so I can't just search for a single number in it. For example:

check that&#39;s is sent and&#42;s is received
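A regex with a capture group can handle any number of references with different values. The following is a minimal sketch, not a definitive implementation: the class name NumericRefRegex is made up for illustration, and it assumes only decimal references of the form &#digits; need to be handled (hexadecimal &#x...; forms are left untouched).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: replaces each decimal reference with the character it names.
final class NumericRefRegex {
    // Group 1 captures the digits between "&#" and ";".
    private static final Pattern DECIMAL_REF = Pattern.compile("&#(\\d+);");

    static String decode(String input) {
        Matcher m = DECIMAL_REF.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement;
            try {
                int codePoint = Integer.parseInt(m.group(1));
                replacement = new String(Character.toChars(codePoint));
            } catch (IllegalArgumentException e) {
                // Too many digits or not a valid code point: keep the original text.
                replacement = m.group();
            }
            // quoteReplacement guards against '$' and '\' in the decoded character.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

With the sample above, NumericRefRegex.decode("check that&#39;s is sent and&#42;s is received") returns "check that's is sent and*s is received".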

I used this function to find every numeric character reference and replace it with just its number:

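public static String decode(String str) {
    StringBuilder sb = new StringBuilder();
    int i1 = 0; // start of the current "&#"
    int i2 = 0; // position where scanning resumes

    while (i2 < str.length()) {
        i1 = str.indexOf("&#", i2);
        if (i1 == -1) {
            // No more references: copy the rest of the string.
            sb.append(str.substring(i2));
            break;
        }
        sb.append(str.substring(i2, i1));
        i2 = str.indexOf(";", i1);
        if (i2 == -1) {
            // Unterminated reference: copy it unchanged.
            sb.append(str.substring(i1));
            break;
        }

        // The digits between "&#" and ";", e.g. "39".
        String appnd = str.substring(i1 + 2, i2);

        // Appends only the number, not the character it stands for.
        sb.append(" " + appnd);

        i2++;
    }
    return sb.toString();
}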
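The same scanning loop can produce the character instead of the number if the extracted digits are parsed and appended as a code point. This is a sketch under the assumption that only decimal references matter; the class name NumericRefLoop is hypothetical, and anything between "&#" and ";" that is not a valid number is copied through unchanged.

```java
// Hypothetical variant of the loop above that emits the referenced character.
final class NumericRefLoop {
    static String decode(String str) {
        StringBuilder sb = new StringBuilder();
        int pos = 0;
        while (pos < str.length()) {
            int start = str.indexOf("&#", pos);
            if (start == -1) {
                // No more references: copy the rest of the string.
                sb.append(str, pos, str.length());
                break;
            }
            sb.append(str, pos, start);
            int end = str.indexOf(';', start);
            if (end == -1) {
                // Unterminated reference: copy it unchanged.
                sb.append(str, start, str.length());
                break;
            }
            String digits = str.substring(start + 2, end);
            try {
                // 39 becomes an apostrophe, 42 becomes an asterisk, and so on.
                sb.appendCodePoint(Integer.parseInt(digits));
            } catch (IllegalArgumentException e) {
                // Not a number or not a valid code point: keep "&#...;" as it was.
                sb.append(str, start, end + 1);
            }
            pos = end + 1;
        }
        return sb.toString();
    }
}
```

For example, NumericRefLoop.decode("check that&#39;s is sent") returns "check that's is sent", and a hexadecimal form such as "&#x27;" passes through untouched.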
