
Java: convert a UTF-8 2-byte char to a 1-byte char

There are many similar questions, but none of them helped me.

A character encoded in UTF-8 can be 1, 2, 3, or 4 bytes long.

ISO-8859-15 is always 2 bytes.

But I need 1-byte characters, as in code page 863 (IBM863).

http://en.wikipedia.org/wiki/Code_page_863

For example, "é" is code point 233 and is 2 bytes long in UTF-8. How can I convert it to IBM863 (1 byte) in Java?

Is this possible on a JVM running with -Dfile.encoding=UTF-8?

Of course, such a conversion means that some characters can be lost, because IBM863 has a smaller repertoire. But I need the language-specific characters, such as the French è, é, etc.

Edit 1:

 String text = "text with é";

 Socket socket = getPrinterSocket( printer);
 BufferedWriter bwOut = getPrinterWriter(printer,socket);
 ...
 bwOut.write("PRTXT \"" + text + "\n");
 ...
 if (socket != null)
 {
     bwOut.close();
     socket.close();
 }
 else
 {
     bwOut.flush();
 }

It's going to a label printer running Fingerprint 8.2.

Edit 2:

private BufferedWriter getPrinterWriter(PrinterLocal printer, Socket socket)
throws IOException
{
        return new BufferedWriter(new OutputStreamWriter(socket.getOutputStream()));
}

First of all: there is no such thing as "1 byte char" or, in fact, "n byte char" for whatever n.

In Java, a char is a UTF-16 code unit; depending on the (Unicode) code point, either one or two chars are necessary to represent it.
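To make the distinction between chars and code points concrete, here is a minimal sketch. The class and method names are made up for illustration; only String.length(), String.codePointCount() and Character.toChars() come from the JDK.

```java
public class CodeUnitDemo {
    // Number of UTF-16 code units (Java chars) in the string.
    static int charCount(String s) {
        return s.length();
    }

    // Number of Unicode code points in the string.
    static int codePointCount(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String eAcute = "\u00E9";                              // é, U+00E9, inside the BMP
        String clef = new String(Character.toChars(0x1D11E)); // U+1D11E, outside the BMP

        System.out.println(charCount(eAcute) + " char(s), "
                + codePointCount(eAcute) + " code point(s)");
        System.out.println(charCount(clef) + " char(s), "
                + codePointCount(clef) + " code point(s)");
    }
}
```

Neither count says anything about bytes: the byte length only exists once the string has been encoded with some charset.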

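You can use the following methods: String.getBytes(Charset) to encode a String into a byte array, new String(byte[], Charset) to decode a byte array into a String, or a CharsetEncoder and a CharsetDecoder.

You obtain the two latter from a Charset's newEncoder() and newDecoder() methods.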
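As a rough sketch of the encoder route: an encoder obtained from a Charset can be told to report characters the target charset cannot represent, rather than silently replacing them. The class name StrictEncodeDemo and the helper encodeStrict are hypothetical, and falling back to ISO-8859-1 when IBM863 is not installed is an assumption made only so the example runs everywhere.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class StrictEncodeDemo {
    // Encodes text with the given charset and fails loudly if a character
    // has no mapping, instead of substituting a replacement byte.
    static byte[] encodeStrict(String text, Charset charset) {
        CharsetEncoder encoder = charset.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            ByteBuffer buf = encoder.encode(CharBuffer.wrap(text));
            byte[] out = new byte[buf.remaining()];
            buf.get(out);
            return out;
        } catch (CharacterCodingException e) {
            throw new IllegalArgumentException(
                    "text cannot be encoded in " + charset.name(), e);
        }
    }

    public static void main(String[] args) {
        // Assumption: use IBM863 when the runtime ships it, else ISO-8859-1.
        Charset cs = Charset.isSupported("IBM863")
                ? Charset.forName("IBM863")
                : StandardCharsets.ISO_8859_1;

        System.out.println(encodeStrict("\u00E9", cs).length + " byte(s) in " + cs.name());
        try {
            encodeStrict("\u20AC", cs); // the euro sign is in neither charset
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

By contrast, String.getBytes replaces unmappable characters with the charset's default replacement byte, so the strict variant is mainly useful when silent loss is not acceptable.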

It is crucially important here to know what your input is exactly: is it a code point, is it an encoded byte array? You'll have to adapt your code depending on this.

Final note: the file.encoding setting defines the default charset, which is used whenever you don't specify one, for instance in a FileReader constructor; you should always specify a charset explicitly to begin with!

// getBytes(String) throws UnsupportedEncodingException if the charset name is unknown
byte[] someUtf8Bytes = ...
String decoded = new String(someUtf8Bytes, StandardCharsets.UTF_8);
byte[] someIso15Bytes = decoded.getBytes("ISO-8859-15");
byte[] someCp863Bytes = decoded.getBytes("cp863");

If you start with a string, just use getBytes with the proper encoding.
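For instance, a small sketch that prints the bytes "é" turns into under two charsets. GetBytesDemo and toHex are illustrative names, and the example assumes the runtime may or may not ship the IBM863 charset, so it checks first.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class GetBytesDemo {
    // Formats bytes as space-separated uppercase hex, e.g. "C3 A9".
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "\u00E9"; // é

        // Two bytes in UTF-8.
        System.out.println("UTF-8:  " + toHex(text.getBytes(StandardCharsets.UTF_8)));

        // One byte in code page 863, if the charset is installed.
        if (Charset.isSupported("IBM863")) {
            System.out.println("IBM863: " + toHex(text.getBytes(Charset.forName("IBM863"))));
        } else {
            System.out.println("IBM863 is not available in this runtime");
        }
    }
}
```

The String itself is the same in both cases; only the byte sequence produced at encoding time differs.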

If you want to write strings with a proper encoding to a socket, you can either write byte arrays directly to the OutputStream (instead of going through a PrintStream or Writer), or pass the charset to the writer:

new BufferedWriter(new OutputStreamWriter(socket.getOutputStream(), "cp863"))
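The effect of passing a charset to OutputStreamWriter can be checked without a printer. The sketch below uses a ByteArrayOutputStream as a stand-in for socket.getOutputStream(); PrinterWriterDemo, newWriter and render are hypothetical names, and ISO-8859-1 is used in main only because it is guaranteed to exist in every JVM.

```java
import java.io.BufferedWriter;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class PrinterWriterDemo {
    // Same shape as getPrinterWriter in the question, but with an explicit charset.
    static BufferedWriter newWriter(OutputStream out, Charset charset) {
        return new BufferedWriter(new OutputStreamWriter(out, charset));
    }

    // Writes text through the writer and returns the bytes that reached the stream.
    static byte[] render(String text, Charset charset) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream(); // stand-in for the socket stream
        try (BufferedWriter writer = newWriter(sink, charset)) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        String text = "\u00E9"; // é
        System.out.println("UTF-8 bytes on the wire:      "
                + render(text, StandardCharsets.UTF_8).length);
        System.out.println("ISO-8859-1 bytes on the wire: "
                + render(text, StandardCharsets.ISO_8859_1).length);
    }
}
```

With the charset fixed on the writer, the bytes sent no longer depend on file.encoding.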

