I am experimenting with the org.apache.commons.lang.StringEscapeUtils class, but I am running into some difficulties.
I have the following situation in my code:
String notNormalized = "c&apos;&egrave;";
System.out.println("NOT NORMALIZED: " + notNormalized);
System.out.println("NORMALIZED: " + StringEscapeUtils.escapeJava(notNormalized));
So first I declared the notNormalized field, which (at least in my head) should represent a non-normalized string containing an apostrophe, represented by &apos;, and an accented vowel, represented by &egrave; (which should be the è character).
Then I print it without normalization, expecting the c&apos;&egrave; string, followed by its normalized version, which I expect to be the converted string c'è.
But the problem is that I still get the same output. In fact, this is what I get in the console:
NOT NORMALIZED: c&apos;&egrave;
NORMALIZED: c&apos;&egrave;
Why? What am I missing? How can I perform this test and correctly convert a string that contains entities such as &apos;?
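What you're looking for is unescapeHtml4, not escapeJava. escapeJava only escapes characters that are special in a Java string literal; it leaves HTML entities untouched. So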
System.out.println("NORMALIZED: " + StringEscapeUtils.unescapeHtml4(notNormalized));
which prints
NORMALIZED: c&apos;è
Unfortunately, &apos; is not an HTML 4 entity and therefore can't be unescaped with this method. You can use unescapeXml for the &apos; but not for the &egrave;. You'll have to mix and match.
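With Commons Lang 3 on the classpath, mixing and matching would presumably mean chaining the two calls, something like StringEscapeUtils.unescapeXml(StringEscapeUtils.unescapeHtml4(notNormalized)). Note that unescapeHtml4 lives in org.apache.commons.lang3; the older org.apache.commons.lang package from the question calls it unescapeHtml. To show the idea without any dependency, here is a minimal sketch that decodes only the two entities from the question with a lookup table. The class name EntityUnescapeSketch, its unescape method, and the two-entry table are hypothetical illustrations, not the library's implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the library calls: decodes only the two
// entities used in the question, via a plain lookup table.
final class EntityUnescapeSketch {

    // Assumption: a deliberately tiny table. Real HTML/XML tables are much larger.
    private static final Map<String, String> ENTITIES = new LinkedHashMap<>();
    static {
        ENTITIES.put("&apos;", "'");        // XML entity, not defined in HTML 4
        ENTITIES.put("&egrave;", "\u00E8"); // HTML 4 entity for e with grave accent
    }

    // Replaces every known entity in the input with its character.
    static String unescape(String input) {
        String result = input;
        for (Map.Entry<String, String> entry : ENTITIES.entrySet()) {
            result = result.replace(entry.getKey(), entry.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        String notNormalized = "c&apos;&egrave;";
        System.out.println("NOT NORMALIZED: " + notNormalized);
        System.out.println("NORMALIZED: " + unescape(notNormalized));
    }
}
```

For anything beyond a quick test, the library methods are the better choice, since they cover the full entity tables and numeric references that this sketch ignores.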