Double Encoded UTF-8 String - MySql, Hibernate

We save a string in a MySQL database through Hibernate, Base64-encoding it on write.

The following code does this:

    @Basic
    @Column(name = "name", nullable = false)
    @ColumnTransformer(read = "FROM_BASE64(name) ", write ="TO_BASE64(?)")
    public String getName()

Now, when I save rotebühlstr, it is stored in the DB as cm90ZWLDvGhsc3Ry. When I print it on the terminal, it is shown as rotebÃ¼hlstr (the UTF-8 bytes of "ü" read as latin1), whereas it should be rotebühlstr.
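The stored value and the garbled output can be reproduced outside the database. The sketch below is only an illustration, not the code path Hibernate or MySQL actually takes: the class and method names are hypothetical, and Java's ISO_8859_1 stands in for MySQL's latin1 (which is really cp1252, but the two agree for the bytes involved here).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical demo class; not part of the original project.
public class MojibakeDemo {
    // Base64-encode the UTF-8 bytes of the input string.
    public static String toBase64Utf8(String s) {
        return Base64.getEncoder().encodeToString(s.getBytes(StandardCharsets.UTF_8));
    }

    // Decode Base64 and interpret the raw bytes in the given charset.
    public static String fromBase64(String b64, Charset cs) {
        return new String(Base64.getDecoder().decode(b64), cs);
    }

    public static void main(String[] args) {
        String original = "roteb\u00fchlstr"; // "rotebühlstr", escaped to avoid source-encoding issues
        String stored = toBase64Utf8(original);
        System.out.println(stored); // cm90ZWLDvGhsc3Ry
        // Correct: bytes written as UTF-8 are read back as UTF-8.
        System.out.println(fromBase64(stored, StandardCharsets.UTF_8));
        // Mismatch: the same bytes read as Latin-1 turn "ü" (C3 BC) into two characters.
        System.out.println(fromBase64(stored, StandardCharsets.ISO_8859_1));
    }
}
```

If the third line printed matches what the terminal shows, the bytes in the table are fine and the problem is in how they are interpreted on the way back.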

This is a Dropwizard project, and the config.yaml for the MySQL connection is as follows:

      properties:
        charSet: UTF-8
        characterEncoding: UTF-8
        useUnicode: true
        hibernate.dialect: org.hibernate.dialect.MySQL5InnoDBDialect
        hibernate.jdbc.batch_size: 100
        hibernate.envers.audit_table_suffix: "_aud"
        hibernate.id.new_generator_mappings: false

MySQL column description: name varchar(200) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL

    mysql> show variables like 'character_set_%';
    +--------------------------+-----------------------------------------------------------+
    | Variable_name            | Value                                                     |
    +--------------------------+-----------------------------------------------------------+
    | character_set_client     | utf8mb4                                                   |
    | character_set_connection | utf8mb4                                                   |
    | character_set_database   | utf8mb4                                                   |
    | character_set_filesystem | binary                                                    |
    | character_set_results    | utf8mb4                                                   |
    | character_set_server     | latin1                                                    |
    | character_set_system     | utf8                                                      |
    | character_sets_dir       | /usr/local/mysql-5.7.23-macos10.13-x86_64/share/charsets/ |
    +--------------------------+-----------------------------------------------------------+
    8 rows in set (0.01 sec)

Observation:

In my colleague's local setup, this works fine. There, Java/Hibernate treats the input string as latin1 and not as UTF-8, so rotebühlstr is encoded in the DB as cm90ZWL8aGxzdHI= and decoded correctly as rotebühlstr.

^ This was happening because of a difference in character_set_server: it was set to latin1 on my machine and to utf-8 on my colleague's.
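The two Base64 values quoted in this question differ only in which charset produced the bytes before encoding. A minimal sketch to confirm that, assuming Java's ISO_8859_1 as a stand-in for MySQL's latin1 (the class name is hypothetical):

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical demo class; not part of the original project.
public class CharsetBase64Compare {
    // Base64-encode the bytes of s in the given charset.
    public static String encode(String s, Charset cs) {
        return Base64.getEncoder().encodeToString(s.getBytes(cs));
    }

    public static void main(String[] args) {
        String name = "roteb\u00fchlstr"; // "rotebühlstr"
        // "ü" is two bytes (C3 BC) in UTF-8, giving 12 input bytes in total.
        System.out.println(encode(name, StandardCharsets.UTF_8));      // cm90ZWLDvGhsc3Ry
        // "ü" is one byte (FC) in Latin-1, giving 11 input bytes and one '=' of padding.
        System.out.println(encode(name, StandardCharsets.ISO_8859_1)); // cm90ZWL8aGxzdHI=
    }
}
```

So the two setups store different byte sequences for the same string, and each decodes correctly only when read back with the charset that wrote it.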

What we have tried so far:

  • Sending "Accept-Charset" as latin1. Didn't work.
  • Changing the charset settings for MySQL in config.yaml. Didn't work.

What I can do now:

I could write a wrapper layer for encoding and decoding and stop using @ColumnTransformer. That would fix the problem.
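Such a wrapper might look like the sketch below. It moves the Base64 step into Java with an explicit UTF-8 charset, so the result no longer depends on the server or connection charset. The class name Base64NameCodec is hypothetical; wiring it into the entity (for example through a JPA AttributeConverter or in the getter and setter) is an assumption and is left out so the example stays self-contained.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper; the name is not from the original project.
public final class Base64NameCodec {
    private Base64NameCodec() {}

    // Encode with an explicit charset instead of relying on MySQL's TO_BASE64.
    public static String encode(String plain) {
        if (plain == null) {
            return null;
        }
        return Base64.getEncoder().encodeToString(plain.getBytes(StandardCharsets.UTF_8));
    }

    // The MIME decoder ignores line breaks, which TO_BASE64 inserts into long values.
    public static String decode(String stored) {
        if (stored == null) {
            return null;
        }
        return new String(Base64.getMimeDecoder().decode(stored), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String stored = encode("roteb\u00fchlstr"); // "rotebühlstr"
        System.out.println(stored);
        System.out.println(decode(stored).equals("roteb\u00fchlstr"));
    }
}
```

Rows already written as latin1 bytes (such as cm90ZWL8aGxzdHI=) would not decode correctly with this codec and would need a one-off migration.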

Thanks.

I once had a problem with charsets, and the only charset that fixed it was utf8mb4. As far as I remember, the problem arises because MySQL's utf8 cannot represent some characters (it stores at most three bytes per character).

For more information, you can also check https://stackoverflow.com/a/43692337/2137378.

It works on your colleague's setup but not on yours, because your connection uses the latin1 charset, even though the database and column might be in utf8mb4.

Find your MySQL config file and add these options to their sections. Create the sections if they are missing.

    [mysql]
    default-character-set=utf8mb4

    [client]
    default-character-set=utf8mb4

    [mysqld]
    character_set_server = utf8mb4
    collation_server = utf8mb4_general_ci

Exit any clients, restart the server, and you should be fine. From now on, when you run show create database or show create table, you will also see when the encoding is wrong.
