
Wrong String encoding using JDBC Oracle Thin driver

I am using an Oracle database that holds ISO-8859-1 data. When I read a String from this database through a ResultSet and print it to the console, the output is wrongly encoded. My JVM defaults are:

Locale.getDefault(); // -> fr_FR
Charset.defaultCharset(); // -> UTF-8

I tried printing the data from my ResultSet in two ways:

rs.getString("MY_COL"); // direct from ResultSet
new String(rs.getString("MY_COL").getBytes(Charset.forName("ISO-8859-15")), Charset.forName("UTF-8")); // convert ISO bytes to UTF-8 bytes

This outputs:

générale
générale

So why does the Oracle JDBC driver create a String from ISO-8859-1 encoded bytes? How can I get a String with UTF-8 encoded bytes without altering the database or converting the String? Can I change this through the driver configuration or JVM arguments?

I guess your database is not in ISO 8859-1 (NLS_CHARACTERSET = WE8ISO8859P1).

On the database

create table foo (col1 varchar2(40));
insert into foo values('é');
insert into foo values(chr(233));
select dump(col1) from foo;

should return

Typ=1 Len=1: 233 
Typ=1 Len=1: 233 

If instead you get, for example,

Typ=1 Len=2: 195,169
Typ=1 Len=1: 233

then your database is set up for UTF8 (NLS_CHARACTERSET = AL32UTF8).
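
To see where the two DUMP results come from, here is a small Java sketch that prints the byte values of 'é' in both encodings. The class and method names (CharsetBytesDemo, unsignedBytes) are made up for this illustration; it only uses standard JDK calls and does not touch the database or the Oracle driver.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of any Oracle or JDBC API.
class CharsetBytesDemo {

    // Encodes s in the given charset and returns the bytes as unsigned values,
    // which is how Oracle's DUMP() displays them.
    static int[] unsignedBytes(String s, Charset cs) {
        byte[] raw = s.getBytes(cs);
        int[] out = new int[raw.length];
        for (int i = 0; i < raw.length; i++) {
            out[i] = raw[i] & 0xFF; // Java bytes are signed, so mask to 0..255
        }
        return out;
    }

    // Decodes the UTF-8 bytes of s as if they were ISO-8859-1,
    // which is the kind of mismatch that produces garbled accents.
    static String utf8BytesReadAsLatin1(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String eAcute = "\u00e9"; // 'é', written as an escape to avoid source-encoding issues

        // One byte in ISO-8859-1: [233]
        System.out.println(Arrays.toString(unsignedBytes(eAcute, StandardCharsets.ISO_8859_1)));

        // Two bytes in UTF-8: [195, 169]
        System.out.println(Arrays.toString(unsignedBytes(eAcute, StandardCharsets.UTF_8)));

        // The mismatched decode yields two characters instead of one.
        System.out.println(utf8BytesReadAsLatin1(eAcute).length());
    }
}
```

A stored value of 195,169 is therefore the UTF-8 form of 'é', while a single 233 is its ISO-8859-1 form. Note that a Java String is a sequence of UTF-16 code units and carries no byte encoding of its own; an encoding only applies when the String is converted to or from bytes.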
