
How to insert special characters into MySQL using Java

I have several CSV files that I am loading into MySQL using Java. In the Description field I have several special characters that are causing the load to fail. I am using LOAD DATA INFILE, as seen in the code block below. This is nested in a for-each loop that walks an array of filename/table pairs and runs through each combination until all the files are done.

Here is my JDBC connection string, where I explicitly pass UTF8 as the characterEncoding parameter:

 static String  url = "jdbc:mysql://localhost:3306/iber_stage?verifyServerCertificate=false&characterEncoding=UTF8";

Other connection parameters, and parsing an array of filenames/table names:

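 // Keywords need a trailing space before the concatenated names.
 final String sql1 = "TRUNCATE TABLE " + tableName;
 // filetoEat is assumed to already hold the quoted file path, e.g. '/path/file.csv'.
 final String sql2 = "LOAD DATA INFILE " + filetoEat + " INTO TABLE staging." + tableName + " CHARACTER SET UTF8 FIELDS TERMINATED BY ',' ENCLOSED BY '\"' LINES TERMINATED BY '\\n' IGNORE 1 LINES";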

        try {
        Class.forName("com.mysql.jdbc.Driver");
        con = DriverManager.getConnection(url, username, password);
        st = con.createStatement();
        st.executeUpdate(sql1);
        rs = st.executeQuery(sql2);

        if (rs.toString() != null) {
            returnMsg = rs.toString();
            System.out.println(returnMsg);        
            updFlag = 0; 
            String strRecs = returnMsg.substring(40);
            updateControlTable(updFlag, strRecs);
        }

        } catch (SQLException ex) {
            Logger lgr = Logger.getLogger(update.class.getName());
            lgr.log(Level.SEVERE, ex.getMessage(), ex);
            updFlag = 1;            

        } catch (ClassNotFoundException e) {
            Logger lgr = Logger.getLogger(update.class.getName());
            lgr.log(Level.SEVERE, e.getMessage(), e);
            e.printStackTrace();
            updFlag = 1;

        } 

The code works fine until it comes across a special character, such as a degree symbol or the micro symbol µ, within a Material Description. At that point it throws an exception:

Invalid utf8 character string: 'LUG'

The string LUG is followed by a µ symbol. The DB is set to utf8 - utf8_unicode_ci, and the column in question is a VARCHAR(60) that holds material descriptions. I have tried using ESCAPED BY '\\\\' but I can't seem to get it working correctly. I have also tried CHARACTER SET UTF8. I have also tried a different collation, i.e. utf8_general_ci, to no avail.

Any insight is greatly appreciated.

Have you tried adding

CHARACTER SET UTF8

to the LOAD DATA INFILE instruction?

Full doc: http://dev.mysql.com/doc/refman/5.7/en/load-data.html
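As a rough sketch of where that clause sits in the statement, here is one way the SQL from the question could be assembled with the character set passed in as a parameter. The class name `LoadDataSql` and method `build` are made up for illustration, and the `staging` schema is taken from the question. Which character set is right depends on how the CSV files were actually encoded; `latin1` in the usage line is only an assumption.

```java
final class LoadDataSql {

    /**
     * Builds a LOAD DATA INFILE statement that declares the file's encoding.
     * Hypothetical helper: the name and signature are not from the original post.
     */
    static String build(String filePath, String tableName, String mysqlCharset) {
        // Table and charset names are concatenated into the SQL, so only allow
        // plain identifiers here.
        if (!tableName.matches("[A-Za-z0-9_]+") || !mysqlCharset.matches("[A-Za-z0-9_]+")) {
            throw new IllegalArgumentException("unexpected table or charset name");
        }
        // Escape backslashes and single quotes so the path is a valid SQL string literal.
        String quotedPath = "'" + filePath.replace("\\", "\\\\").replace("'", "\\'") + "'";

        return "LOAD DATA INFILE " + quotedPath
                + " INTO TABLE staging." + tableName
                + " CHARACTER SET " + mysqlCharset
                + " FIELDS TERMINATED BY ',' ENCLOSED BY '\"'"
                + " LINES TERMINATED BY '\\n'"
                + " IGNORE 1 LINES";
    }

    public static void main(String[] args) {
        // Assumption: the file was exported in a Latin-1 style encoding.
        System.out.println(build("/tmp/materials.csv", "materials", "latin1"));
    }
}
```

Note that every keyword is followed by a space before the next concatenated piece; without those spaces the server receives fragments such as `TABLEname` and rejects the statement.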

Can you check with the database collation utf8_general_ci and the character set utf8? It may work for you, as it applies Unicode normalization using language-specific rules.

I figured that I would answer this now that I have found the solution. Because I am using Java to run the LOAD DATA INFILE via JDBC, the JDBC driver seems to check the collation of the DB, not of the actual table being loaded, as it parses the file. So you can't have the DB set to UTF-8 and have a Latin collated table, as you would be able to do with an INSERT statement. I had tried setting the table collation to Latin and even had the field in question set to Latin, but until I changed the entire DB to Latin it kept failing. The CSV files are large, so checking every char in question is not easy, but I was catching the exceptions in Java and was able to determine that the error was generated by the JDBC driver, which was complaining that "Character at line xx is not a UTF-8 character". Running in debug mode allowed me to see more details.

I then concluded that it must not be looking at the Latin collated table it would be filling, but at the DB, which was still set to UTF-8. Changing the DB to Latin was all I needed to do.
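One plausible reading of the error, which the thread does not confirm, is that the CSV files themselves were exported in a single-byte Latin encoding. In that case µ is stored as the lone byte 0xB5, which is not a valid UTF-8 sequence, so anything told to read the file as UTF-8 stops right after "LUG". The sketch below, with the hypothetical class `EncodingCheck`, shows how that could be checked in plain Java, and how such bytes could be re-encoded as UTF-8 as an alternative to changing the DB.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class EncodingCheck {

    /** Returns true only if the bytes form well-formed UTF-8. */
    static boolean isValidUtf8(byte[] bytes) {
        try {
            // REPORT makes the decoder throw instead of silently substituting U+FFFD.
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    /** Re-encodes bytes that are assumed to be ISO-8859-1 (Latin-1) as UTF-8. */
    static byte[] latin1ToUtf8(byte[] latin1Bytes) {
        String text = new String(latin1Bytes, StandardCharsets.ISO_8859_1);
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "LUG" followed by µ as a Latin-1 byte (assumed file content, for illustration).
        byte[] latin1 = {'L', 'U', 'G', (byte) 0xB5};

        System.out.println("valid UTF-8 as-is:        " + isValidUtf8(latin1));
        System.out.println("valid UTF-8 after recode: " + isValidUtf8(latin1ToUtf8(latin1)));
    }
}
```

If a check like this reports invalid UTF-8 for a file, the file's real encoding and the encoding the load is told to expect disagree, and either side can be changed to make them match.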

I hope this will help others in the future.

Pat
