Android: Unicode to readable string

When I read text from a web page, some Unicode characters are not displayed correctly in a TextView.

I retrieve the web content with the following code:

try {
    HttpGet request = new HttpGet();
    request.addHeader("User-Agent", USER_AGENT);
    request.setURI(new URI(wwwlink));
    try {
        response4 = httpClient.execute(request);
    } catch (ClientProtocolException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
} catch (URISyntaxException e) {
    e.printStackTrace();
}
try {
    in2 = null;
    String UTF8 = "UTF-8";
    // Decode the response body as UTF-8, matching the charset the page declares.
    in2 = new BufferedReader(new InputStreamReader(response4.getEntity().getContent(), UTF8));
} catch (IllegalStateException e) {
    Log.i(tag, e.toString());
} catch (IOException e) {
    Log.i(tag, e.toString());
}

The page I am reading has this meta tag in its HTML head:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>

Now the problem: I read the page line by line, and some of the text I need contains Unicode escape sequences like this:

20 \u00b0C (that is, 20 °C)

I am trying to convert this and display it as a degree symbol in a TextView.

The following works:

textView.setText("\u00b0");

But it does not work when the line read from the page contains the escape sequence:

line = in2.readLine();
textView.setText(line);

The TextView then displays the escape sequence literally, for example: some text \u00b0 some text

I've checked this on both the emulator and a phone.

Your input text contains the Java escape notation for Unicode characters as literal text (a backslash, a "u", and four hex digits), so you need to replace those sequences with the actual characters yourself. The compiler does this for string literals in source code, which is why the hard-coded "\u00b0" works, but it does not happen for text read at runtime. Here is an example of how to replace a single character in the String, just to give a rough idea:

    import java.util.Scanner;
    String input = "some text \\u00b0 some text";
    Scanner scanner = new Scanner(input);
    // Find the first literal backslash-u escape followed by four hex digits.
    String unicodeCharStr = scanner.findWithinHorizon("\\\\u[0-9a-fA-F]{4}", 0);
    if (unicodeCharStr != null) {
        char unicodeChar = (char) Integer.parseInt(unicodeCharStr.substring(2, 6), 16);
        input = input.replace(unicodeCharStr, String.valueOf(unicodeChar));
    }
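To go from that rough idea to something that handles every escape in a line, one option is a regex loop over the whole string. The following is only a sketch under a few assumptions: the class name UnicodeUnescaper and the method unescape are made up for illustration, and it only recognizes the four-hex-digit \uXXXX form shown in the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the name is an assumption for illustration.
public class UnicodeUnescaper {
    // Matches a literal backslash, 'u', and exactly four hex digits.
    private static final Pattern ESCAPE = Pattern.compile("\\\\u([0-9a-fA-F]{4})");

    // Replaces every literal \uXXXX sequence with the character it denotes.
    public static String unescape(String input) {
        Matcher m = ESCAPE.matcher(input);
        // StringBuffer is used because appendReplacement only accepts
        // StringBuilder from Java 9 onwards.
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            char decoded = (char) Integer.parseInt(m.group(1), 16);
            // quoteReplacement guards against decoded '$' or '\' being
            // interpreted as replacement syntax.
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(decoded)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(unescape("some text \\u00b0 some text"));
    }
}
```

With a helper like this, the line from the question could be passed through it before display, along the lines of textView.setText(UnicodeUnescaper.unescape(line)). Text without any escape sequences is returned unchanged.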
