Formatting Web Service Response

Question

I use the function below to retrieve the web service response:

private String getSoapResponse (String url, String host, String encoding, String soapAction, String soapRequest) throws MalformedURLException, IOException, Exception {         
    URL wsUrl = new URL(url);     
    URLConnection connection = wsUrl.openConnection();     
    HttpURLConnection httpConn = (HttpURLConnection)connection;     
    ByteArrayOutputStream bout = new ByteArrayOutputStream(); 

    byte[] buffer = new byte[soapRequest.length()];     
    buffer = soapRequest.getBytes();     
    bout.write(buffer);     
    byte[] b = bout.toByteArray();          

    httpConn.setRequestMethod("POST");
    httpConn.setRequestProperty("Host", host);

    if (encoding == null || encoding.isEmpty())
        encoding = UTF8;

    httpConn.setRequestProperty("Content-Type", "text/xml; charset=" + encoding);
    httpConn.setRequestProperty("Content-Length", String.valueOf(b.length));
    httpConn.setRequestProperty("SOAPAction", soapAction);

    httpConn.setDoOutput(true);
    httpConn.setDoInput(true);

    OutputStream out = httpConn.getOutputStream();
    out.write(b); 
    out.close();

    InputStreamReader is = new InputStreamReader(httpConn.getInputStream());
    StringBuilder sb = new StringBuilder();
    BufferedReader br = new BufferedReader(is);
    String read = br.readLine();

    while(read != null) {
        sb.append(read);
        read = br.readLine();
    }

    String response = decodeHtmlEntityCharacters(sb.toString());    

    return response = decodeHtmlEntityCharacters(response);
}

My problem with this code is that the response it returns is full of character entities, which makes the XML structure invalid.
Example response:

&lt;PLANT&gt;A565&lt;/PLANT&gt;
          &lt;PLANT&gt;A567&lt;/PLANT&gt;
          &lt;PLANT&gt;A585&lt;/PLANT&gt;
          &lt;PLANT&gt;A921&lt;/PLANT&gt;
          &lt;PLANT&gt;A938&lt;/PLANT&gt;
        &lt;/PLANT_GROUP&gt;
      &lt;/KPI_PLANT_GROUP_KEYWORD&gt;
      &lt;MSU_CUSTOMERS/&gt;
    &lt;/DU&gt;
    &lt;DU&gt;

To solve this, I pass the whole response to the method below, which replaces each entity with the character it stands for.

private final static Hashtable htmlEntitiesTable = new Hashtable();
static {
    htmlEntitiesTable.put("&amp;","&");
    htmlEntitiesTable.put("&quot;","\"");
    htmlEntitiesTable.put("&lt;","<");
    htmlEntitiesTable.put("&gt;",">");  
}

private String decodeHtmlEntityCharacters(String inputString) throws Exception {
    Enumeration en = htmlEntitiesTable.keys();

    while(en.hasMoreElements()){
        String key = (String)en.nextElement();
        String val = (String)htmlEntitiesTable.get(key);

        inputString = inputString.replaceAll(key, val);
    }

    return inputString;
}

But another problem arose. If the response contains a segment such as <VALUE>&lt; 0.5</VALUE> and it is passed through this method, the output is:

<VALUE>< 0.5</VALUE>

This makes the XML structure invalid again. The data itself, "< 0.5", is correct and valid, but an unescaped "<" inside the VALUE element breaks the structure of the XML.

How should I deal with this? Maybe the way I fetch or build the response can be improved. Is there a better way to call the web service and read its response?

How can I deal with elements containing "<" or ">"?

Answer 1

Do you know how to use a third-party open source library?

You should try Apache Commons Lang:

StringEscapeUtils.unescapeXml(xml)

More detail is provided in the following Stack Overflow post:

how to unescape XML in java

Documentation:

http://commons.apache.org/proper/commons-lang/javadocs/api-release/index.html
http://commons.apache.org/proper/commons-lang/userguide.html#lang3

Answer 2

You're using SOAP wrong.

In particular, you do not need the following line of code:

     String response = decodeHtmlEntityCharacters(sb.toString());

Just return sb.toString(). And for $DEITY's sake, do not use string methods to parse the retrieved string; use an XML parser or a full-blown SOAP stack.
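To make the "use an XML parser" advice concrete, here is a minimal sketch using only the JDK's DOM parser. It assumes the escaped markup in the question is an XML payload carried as a string inside one element of the SOAP envelope; the wrapper element name "return", the class name and the method names are hypothetical and not taken from the question. The parser undoes exactly one level of escaping per parse, so "&amp;lt; 0.5" in the envelope becomes "&lt; 0.5" in the payload and "< 0.5" only as the final text value.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class EmbeddedPayloadReader {

    // Parses a string into a DOM tree; the parser resolves entities itself.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // SOAP envelopes normally use namespaces
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Not well-formed XML", e);
        }
    }

    // Text of the first element with the given local name, in any namespace.
    // Sketch only: throws NullPointerException if no such element exists.
    static String firstText(Document doc, String localName) {
        return doc.getElementsByTagNameNS("*", localName).item(0).getTextContent();
    }

    // Step 1: parse the envelope and read the wrapper's text (one level unescaped).
    // Step 2: parse that text as XML and read the wanted element.
    static String innerText(String soapResponse, String wrapper, String element) {
        String payload = firstText(parse(soapResponse), wrapper);
        return firstText(parse(payload), element);
    }

    public static void main(String[] args) {
        // Hypothetical response: the payload is escaped once inside <return>.
        String soap = "<Envelope><Body><return>"
                + "&lt;DU&gt;&lt;VALUE&gt;&amp;lt; 0.5&lt;/VALUE&gt;&lt;/DU&gt;"
                + "</return></Body></Envelope>";
        System.out.println(innerText(soap, "return", "VALUE"));
    }
}
```

With this approach no manual entity replacement is needed at any point, so a "<" that belongs to the data can never be confused with markup.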

Answer 3

Does the > or < character always appear at the beginning of a value? Then you could use regex to handle the cases in which the > or < are followed by a digit (or dot, for that matter).

Sample code, assuming the replacement strings used in it don't appear anywhere else in the XML:

private String decodeHtmlEntityCharacters(String inputString) throws Exception {
    Enumeration en = htmlEntitiesTable.keys();

    // Replaces &gt; or &lt; followed by dot or digit (while keeping the dot/digit)
    inputString = inputString.replaceAll("&gt;(\\.?\\d)", "Valuegreaterthan$1");
    inputString = inputString.replaceAll("&lt;(\\.?\\d)", "Valuelesserthan$1");

    while(en.hasMoreElements()){
        String key = (String)en.nextElement();
        String val = (String)htmlEntitiesTable.get(key);

        inputString = inputString.replaceAll(key, val);
    }

    inputString = inputString.replaceAll("Valuelesserthan", "&lt;");
    inputString = inputString.replaceAll("Valuegreaterthan", "&gt;");

    return inputString;
}

Note that the most appropriate answer (and the easiest for everyone) would be to encode the XML correctly on the sender side (which, by the way, would also make my solution stop working).
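As a rough illustration of what correct encoding on the sender side can look like, the sketch below builds the element with the JDK's DOM API and serializes it with a Transformer, so the escaping is done by the library rather than by hand. The class and method names are made up for this example, and nothing is known about how the real service builds its response.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class SenderSideEncoding {

    // Builds <VALUE>text</VALUE> and serializes it; the serializer escapes
    // markup characters that occur in the text content.
    static String valueElement(String text) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element value = doc.createElement("VALUE");
            value.setTextContent(text); // raw data, not pre-escaped
            doc.appendChild(value);

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not serialize element", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(valueElement("< 0.5"));
    }
}
```

A receiver that parses this output with an XML parser gets the original text back without any string replacement.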

Answer 4

It would be hard to cope with every situation, but you could cover the most common ones by adding a few more rules: assume that any less-than sign followed by a space is data, and that any greater-than sign preceded by a space is data, and encode those again.

private final static Hashtable htmlEntitiesTable = new Hashtable();
static {
    htmlEntitiesTable.put("&amp;","&");
    htmlEntitiesTable.put("&quot;","\"");
    htmlEntitiesTable.put("&lt;","<");
    htmlEntitiesTable.put("&gt;",">");  
}

private String decodeHtmlEntityCharacters(String inputString) throws Exception {
    Enumeration en = htmlEntitiesTable.keys();

    while(en.hasMoreElements()){
        String key = (String)en.nextElement();
        String val = (String)htmlEntitiesTable.get(key);

        inputString = inputString.replaceAll(key, val);
    }

    inputString = inputString.replaceAll("< ","&lt; ");       
    inputString = inputString.replaceAll(" >"," &gt;");       

    return inputString;
}

Answer 5

'>' does not need to be escaped in XML text, so you shouldn't have an issue with that. Regarding '<', here are the options I can think of:

1. Use CDATA in the web response for text containing special characters.
2. Rewrite the text by reversing the order. For example, if it is x < 2, change it to 2 > x. '>' only has to be escaped when it appears in the sequence "]]>".
3. Use another attribute or element in the XML response to indicate '<' or '>'.
4. Use a regular expression to find a sequence that starts with '<', is followed by a string, and is then followed by the '<' of the closing tag. Replace it with some code or value that you can interpret and replace later.

Also, you don't need to do this:

String response = decodeHtmlEntityCharacters(sb.toString());

You should be able to parse the XML after you take care of the '<' sign in text.

You can use this site for testing regular expressions.
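The CDATA option from the list above can be sketched as follows, assuming the service can be changed to wrap such values in a CDATA section. The class and method names are hypothetical; the parsing uses only the JDK's DOM API.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class CdataValue {

    // Parses a single-element document and returns the root element's text.
    // Text inside a CDATA section is returned as-is, with no unescaping.
    static String rootText(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalArgumentException("Not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        // The '<' inside CDATA is data, so the document stays well-formed.
        System.out.println(rootText("<VALUE><![CDATA[< 0.5]]></VALUE>"));
    }
}
```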

Answer 6

Why not deserialize the XML into an object? It's much easier than what you are doing.

For example (C#):

var ser = new XmlSerializer(typeof(MyXMLObject));
using (var reader = XmlReader.Create("http.....xml"))
{
     MyXMLObject _myobj = (MyXMLObject)ser.Deserialize(reader);
}

6 answers

solution1
3 2013-10-28 19:53:57

solution2
3 2013-10-29 17:46:42

solution3
1 2013-10-28 18:52:51

solution4
0 2013-10-24 22:00:46

solution5
0 2013-10-29 19:16:31

solution6
0 2013-10-30 01:18:01
