How do I convert a Unicode string to Turkish in Java?

Question

Hi, I want to convert the Unicode value "\u20BA" to the equivalent Turkish string. Can anybody help me, please?

I used the following code:

try {
  String string = "\u20BA";
  System.out.println(string + " " + string.toLowerCase());
  // Locale.setDefault(new Locale("tr"));
  // Locale tr = new Locale("TR","tr");
  byte[] converttoBytes = string.toLowerCase().getBytes("UTF-8");
  string = new String(converttoBytes, "Cp1254");
  System.out.println(string + " " + string.toLowerCase());
} catch (Exception e) {
  e.printStackTrace();
}

Answer 1

Think of a String in Java as a sequence of characters independent of any character encoding. It therefore does not make sense to speak of changing the encoding of a String.

Character encodings only come into play when you convert between characters and bytes. This usually happens when you read or write characters from or to a stream of bytes (for example, a file). If you don't specify the encoding explicitly, the platform encoding is used.

If you run into difficulties, make sure your platform encoding is set correctly, or specify the correct encoding explicitly.
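As a minimal sketch of what specifying the encoding explicitly can look like, the example below names the charset on both the encode and the decode step instead of relying on the platform default. The class and method names (EncodingRoundTrip, toUtf8, fromUtf8) are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

public class EncodingRoundTrip {
    // Characters -> bytes: the charset is named explicitly, so the
    // platform default encoding plays no part.
    static byte[] toUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Bytes -> characters: decode with the same charset that produced the bytes.
    static String fromUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String lira = "\u20BA";
        byte[] encoded = toUtf8(lira);
        // The lira sign takes three bytes in UTF-8.
        System.out.println("UTF-8 byte count: " + encoded.length);
        // Decoding with the matching charset gives back the original String.
        System.out.println("Round trip intact: " + fromUtf8(encoded).equals(lira));
    }
}
```

Decoding those same bytes with a different charset, as the question's code does with Cp1254, reinterprets the bytes rather than converting the text, which is why the output looks garbled.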

Answer 2

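The key is whether the escape reaches Java as a single code point or as six separate characters. In Java source code the compiler resolves the escape, so both the String literal "\u20BA" and the char literal '\u20BA' already hold the single character ₺. You only end up with six separate characters when the backslash sequence arrives as plain text at run time (for example, read from a file, or written in source as "\\u20BA"). To build a string from individual code points, try this for your specific question: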

StringBuilder sb = new StringBuilder();
sb.append('\u20BA');
System.out.println(sb.toString());

Note that the Unicode value is in single quotes, which makes it a single character value. As you may have guessed, you can continue appending other Unicode values in this way to build up a string. However, as has been mentioned, this might not be the best answer to whatever underlying problem you're working on.
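The difference between an escape resolved by the compiler and the same escape arriving as plain text can be checked directly. The sketch below is an illustration only: decodeEscape is a hypothetical helper, not a JDK method, and it assumes the input is exactly a backslash, the letter u, and four hex digits.

```java
public class UnicodeEscapeDemo {
    // Hypothetical helper: converts the six-character text form of an escape
    // (backslash, 'u', four hex digits) into the character it names.
    static String decodeEscape(String escape) {
        if (escape.length() != 6 || !escape.startsWith("\\u")) {
            throw new IllegalArgumentException("not a 4-digit escape: " + escape);
        }
        int codePoint = Integer.parseInt(escape.substring(2), 16);
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        String compiled = "\u20BA";   // resolved by the compiler: one char
        String asText = "\\u20BA";    // escaped backslash: six chars of plain text

        System.out.println(compiled.length()); // 1
        System.out.println(asText.length());   // 6

        // Text that arrives at run time has to be decoded by hand.
        System.out.println(decodeEscape(asText).equals(compiled)); // true
    }
}
```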

Answer 3

The lira sign (U+20BA) was created in 2012, and neither the CP1254 nor the ISO-8859-9 character set includes it.

This can be demonstrated on Linux with the following commands (U+20BA is encoded in UTF-8 as the three bytes E2 82 BA):

$ echo -e "\xE2\x82\xBA"
₺
$ echo -e "\xE2\x82\xBA" | iconv --from utf8 --to cp1254
iconv: illegal input sequence at position 0
$ echo -e "\xE2\x82\xBA" | iconv --from utf8 --to iso88599
iconv: illegal input sequence at position 0
$ echo -e "\xE2\x82\xBA" | iconv --from utf8 --to cp1254//TRANSLIT
?
$ echo -e "\xE2\x82\xBA" | iconv --from utf8 --to iso88599//TRANSLIT
?
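The same check can be sketched in Java with CharsetEncoder.canEncode, which reports whether a charset has a mapping for a given character. The class name LiraEncodingCheck and its canEncode wrapper are made up for illustration, and the example assumes the JDK in use ships the windows-1254 and ISO-8859-9 charsets.

```java
import java.nio.charset.Charset;

public class LiraEncodingCheck {
    // True if the named charset has a byte representation for the character.
    static boolean canEncode(String charsetName, char c) {
        return Charset.forName(charsetName).newEncoder().canEncode(c);
    }

    public static void main(String[] args) {
        char lira = '\u20BA';
        for (String name : new String[] {"UTF-8", "windows-1254", "ISO-8859-9"}) {
            System.out.println(name + ": " + canEncode(name, lira));
        }

        // String.getBytes substitutes the charset's replacement byte for
        // unmappable characters instead of throwing an exception.
        byte[] replaced = "\u20BA".getBytes(Charset.forName("windows-1254"));
        System.out.println((char) replaced[0]);
    }
}
```

Because String.getBytes replaces unmappable characters silently, the lira sign is lost without any exception when it is encoded to one of the Turkish single-byte charsets, much like the //TRANSLIT results above.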
