When I read text from a web page, Unicode characters are not displayed correctly in a TextView.
I retrieve the web content with the following code:
try {
    HttpGet request = new HttpGet();
    request.addHeader("User-Agent", USER_AGENT);
    request.setURI(new URI(wwwlink));
    try {
        response4 = httpClient.execute(request);
    } catch (ClientProtocolException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
} catch (URISyntaxException e) {
    e.printStackTrace();
}
try {
    in2 = null;
    String UTF8 = "UTF-8";
    in2 = new BufferedReader(new InputStreamReader(response4.getEntity().getContent(), UTF8));
} catch (IllegalStateException e) {
    Log.i(tag, e.toString());
} catch (IOException e) {
    Log.i(tag, e.toString());
}
The page I am reading has this meta tag in its head:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
The problem: some of the lines I read contain Unicode escape sequences, like this:
20 \u00b0C (20, degree symbol, C)
I want to convert this and display it as a degree symbol in the TextView.
The following works:
textview.setText("\u00b0");
But when the line read from the page contains the escape sequence:
line = in2.readLine();
textview.setText(line);
the TextView displays it literally, for example: some text \° some text
I've checked everything with the emulator and a phone.
Because your input text contains the Java escape notation for a Unicode character as literal characters (a backslash, the letter u, and four hex digits), you have to replace each such sequence with the actual character yourself. Here is an example that replaces a single escape in a String, just to give a rough idea:
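import java.util.Scanner;
String input = "some text \\u00b0 some text";
Scanner scanner = new Scanner(input);
// Find the first literal \uXXXX sequence (backslash, 'u', four hex digits).
String unicodeCharStr = scanner.findWithinHorizon("\\\\u[0-9a-fA-F]{4}", 0);
// Parse the four hex digits into the character they encode.
char unicodeChar = (char) Integer.parseInt(unicodeCharStr.substring(2, 6), 16);
input = input.replace(unicodeCharStr, String.valueOf(unicodeChar));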
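To handle every escape in a line rather than just the first one, the same idea can be extended with a regex loop. The following is only a minimal sketch: the class name UnicodeUnescaper and the method unescape are hypothetical names chosen for illustration, and it assumes every escape has exactly four hex digits, as in the example above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
public class UnicodeUnescaper {
    // A literal backslash, 'u', then exactly four hex digits (captured as group 1).
    private static final Pattern ESCAPE = Pattern.compile("\\\\u([0-9a-fA-F]{4})");

    public static String unescape(String input) {
        Matcher m = ESCAPE.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Convert the hex digits into the character they encode.
            char decoded = (char) Integer.parseInt(m.group(1), 16);
            // quoteReplacement guards against decoded '\' or '$' being
            // interpreted as replacement syntax.
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(decoded)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // The Java literal "\\u00b0" is the six characters \u00b0 at runtime,
        // which is what a line read from the page would contain.
        System.out.println(unescape("20 \\u00b0C")); // prints "20 °C"
    }
}
```

With this in place, the line read from the page could be passed through unescape before being handed to the TextView, for example textview.setText(UnicodeUnescaper.unescape(line)).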