
Is this database dump design ok?

I have written a Java program to do the following and would like opinions on my design:

  1. Read data from a CSV file. The file is a database dump with 6 columns.
  2. Write the data into a MySQL database table.

The database table is as follows:

    CREATE TABLE MYTABLE
    (
        ID int PRIMARY KEY not null auto_increment,
        ARTICLEID int,
        ATTRIBUTE varchar(20),
        VALUE text,
        LANGUAGE smallint,
        TYPE smallint
    );

  1. I created an object to store each row.
  2. I used OpenCSV to read each row into a list of the objects created in 1.
  3. I iterate over this list and, using a PreparedStatement, write each row to the database.

The solution should be highly amenable to changes in requirements and demonstrate a good approach, robustness and code quality.

Does that design look ok?

Another method I tried was to use the 'LOAD DATA LOCAL INFILE' SQL statement. Would that be a better choice?

EDIT: I'm now using OpenCSV, and it handles commas inside actual fields. The issue now is that nothing is being written to the DB. Can anyone tell me why?

    public static void exportDataToDb(List<Object> data) {
        Connection conn = connect("jdbc:mysql://localhost:3306/datadb","myuser","password");

        try{
            PreparedStatement preparedStatement = null;
            String query = "INSERT into mytable (ID, X, Y, Z) VALUES(?,?,?,?);";
            preparedStatement = conn.prepareStatement(query);

            for(Object o : data){
                preparedStatement.setString(1, o.getId());
                preparedStatement.setString(2, o.getX());
                preparedStatement.setString(3, o.getY());
                preparedStatement.setString(4, o.getZ());
            }
            preparedStatement.executeBatch();

        }catch (SQLException s){
            System.out.println("SQL statement is not executed!");
        }
    }

From a purely algorithmic perspective, and unless your source CSV file is small, it would be better to:

  1. prepare your insert statement
  2. start a transaction
  3. load one (or a few) line(s) from the file
  4. insert the small batch into your database
  5. return to 3 while there are lines remaining
  6. commit

This way, you avoid loading the entire dump in memory.
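The six steps above could be sketched roughly as follows. This is only an illustration under assumptions, not the asker's code: `BatchedCsvLoader` and `load` are hypothetical names, each `String[]` is assumed to hold the six columns in table order with well-formed numbers, and `batchSize` is assumed to be positive. Note the `addBatch()` call inside the loop: `executeBatch()` only sends rows that were queued with `addBatch()`, and the snippet in the question never calls it, which would explain why nothing reaches the table.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Iterator;

// Hypothetical helper: streams rows into MYTABLE in small batches
// inside a single transaction, without holding the whole file in memory.
final class BatchedCsvLoader {

    private static final String INSERT_SQL =
        "INSERT INTO MYTABLE (ID, ARTICLEID, ATTRIBUTE, VALUE, LANGUAGE, TYPE) "
        + "VALUES (?, ?, ?, ?, ?, ?)";

    /** Inserts every row from the iterator and returns the number of rows queued. */
    static int load(Connection conn, Iterator<String[]> rows, int batchSize)
            throws SQLException {
        boolean previousAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false);                    // start a transaction
        try (PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
            int total = 0;
            int pending = 0;
            while (rows.hasNext()) {                  // pull one row at a time
                String[] r = rows.next();
                ps.setInt(1, Integer.parseInt(r[0]));
                ps.setInt(2, Integer.parseInt(r[1]));
                ps.setString(3, r[2]);
                ps.setString(4, r[3]);
                ps.setShort(5, Short.parseShort(r[4]));
                ps.setShort(6, Short.parseShort(r[5]));
                ps.addBatch();                        // queue the row
                total++;
                if (++pending == batchSize) {         // send a full batch
                    ps.executeBatch();
                    pending = 0;
                }
            }
            if (pending > 0) {
                ps.executeBatch();                    // send the final partial batch
            }
            conn.commit();
            return total;
        } catch (SQLException | RuntimeException e) {
            conn.rollback();                          // undo the partial load
            throw e;
        } finally {
            conn.setAutoCommit(previousAutoCommit);
        }
    }
}
```

The iterator can come from any reader that yields one parsed row at a time, so memory use stays bounded by the batch size rather than the file size. Rethrowing the exception, instead of printing a fixed message, also keeps the real cause of a failed insert visible.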

But basically, you would probably be better off using LOAD DATA.
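For comparison, a minimal sketch of issuing LOAD DATA from Java might look like this. The class and method names are hypothetical, and the format clauses are assumptions about the dump: comma-separated fields, optionally wrapped in double quotes, newline-terminated lines, and columns in table order. The LOCAL variant also has to be permitted on both sides: by the server's `local_infile` setting and by the JDBC driver (with MySQL Connector/J this is, as far as I know, the `allowLoadLocalInfile` connection property; check the driver documentation for your version).

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper: lets MySQL parse and load the CSV file itself.
final class LoadDataExample {

    /** Builds the LOAD DATA statement for the given client-side file path. */
    static String buildLoadDataSql(String csvPath) {
        // Escape backslashes and single quotes so the path is a valid SQL string literal.
        String escaped = csvPath.replace("\\", "\\\\").replace("'", "\\'");
        return "LOAD DATA LOCAL INFILE '" + escaped + "' "
            + "INTO TABLE MYTABLE "
            + "FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"' "
            + "LINES TERMINATED BY '\\n' "
            + "(ID, ARTICLEID, ATTRIBUTE, VALUE, LANGUAGE, TYPE)";
    }

    /** Runs the statement and returns the row count reported by the driver. */
    static int load(Connection conn, String csvPath) throws SQLException {
        try (Statement st = conn.createStatement()) {
            return st.executeUpdate(buildLoadDataSql(csvPath));
        }
    }
}
```

The trade-off is that the parsing rules now live in the SQL statement instead of in Java, so any per-row validation or transformation has to happen either before the file is written or in the database afterwards.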

If the number of rows is huge, the code will fail at step 2 with an out-of-memory error. You need to find a way to read the rows in chunks and execute a prepared-statement batch for each chunk, continuing until all the rows are processed. This works for any number of rows, and the batching also improves performance. Other than this, I don't see any issue with the design.
