
Is this database dump design ok?

I have written a Java program to do the following and would like opinions on my design:

  1. Read data from a CSV file. The file is a database dump with 6 columns.
  2. Write data into a MySQL database table.

The database table is as follows:

    CREATE TABLE MYTABLE
    (
        ID int PRIMARY KEY not null auto_increment,
        ARTICLEID int,
        ATTRIBUTE varchar(20),
        VALUE text,
        LANGUAGE smallint,
        TYPE smallint
    );
  1. I created an object to store each row.
  2. I used OpenCSV to read each row into a list of the objects created in step 1.
  3. I iterate over this list and, using a PreparedStatement, write each row to the database, roughly as sketched below.
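
Roughly, the flow is sketched below. This is a trimmed illustration rather than my exact code; the file path, credentials, and the CSV column order are assumptions.

    import com.opencsv.CSVReader;
    import java.io.FileReader;
    import java.sql.*;
    import java.util.*;

    public class CsvToMySql {

        // Step 1: a plain holder object for one CSV row.
        static class Row {
            int articleId;
            String attribute;
            String value;
            int language;
            int type;
        }

        public static void main(String[] args) throws Exception {
            // Step 2: read every CSV line into a list of Row objects with OpenCSV.
            List<Row> rows = new ArrayList<>();
            try (CSVReader reader = new CSVReader(new FileReader("dump.csv"))) {
                String[] line;
                while ((line = reader.readNext()) != null) {
                    Row r = new Row();
                    r.articleId = Integer.parseInt(line[1]); // assuming column 0 holds the dumped ID
                    r.attribute = line[2];
                    r.value     = line[3];
                    r.language  = Integer.parseInt(line[4]);
                    r.type      = Integer.parseInt(line[5]);
                    rows.add(r);
                }
            }

            // Step 3: insert each row with a PreparedStatement (ID is auto_increment, so it is omitted).
            String sql = "INSERT INTO MYTABLE (ARTICLEID, ATTRIBUTE, VALUE, LANGUAGE, TYPE) VALUES (?,?,?,?,?)";
            try (Connection conn = DriverManager.getConnection(
                     "jdbc:mysql://localhost:3306/datadb", "myuser", "password");
                 PreparedStatement ps = conn.prepareStatement(sql)) {
                for (Row r : rows) {
                    ps.setInt(1, r.articleId);
                    ps.setString(2, r.attribute);
                    ps.setString(3, r.value);
                    ps.setInt(4, r.language);
                    ps.setInt(5, r.type);
                    ps.executeUpdate();
                }
            }
        }
    }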

The solution should easily accommodate changes in requirements and demonstrate a sound approach, robustness, and good code quality.

Does that design look ok?

Another method I tried was the 'LOAD DATA LOCAL INFILE' SQL statement. Would that be a better choice?

EDIT: I'm now using OpenCSV, and it handles commas inside field values correctly. The issue now is that nothing is being written to the DB. Can anyone tell me why?

public static void exportDataToDb(List<Object> data) {
    Connection conn = connect("jdbc:mysql://localhost:3306/datadb","myuser","password");

    try{
        PreparedStatement preparedStatement = null;
        String query = "INSERT into mytable (ID, X, Y, Z) VALUES(?,?,?,?);";
        preparedStatement = conn.prepareStatement(query);

        for(Object o : data){   
            preparedStatement.setString(1, o.getId());
            preparedStatement.setString(2, o.getX());
            preparedStatement.setString(3, o.getY());
            preparedStatement.setString(4, o.getZ());
        }
        preparedStatement.executeBatch();

    }catch (SQLException s){
        System.out.println("SQL statement is not executed!");
    }
}

From a purely algorithmic perspective, and unless your source CSV file is small, it would be better to

  1. prepare your insert statement
  2. start a transaction
  3. read one (or a few) line(s) from the CSV
  4. insert that small batch into your database
  5. return to step 3 while there are lines remaining
  6. commit

This way, you avoid loading the entire dump in memory.
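
As a sketch only (assuming the CSV lines arrive as a streaming Iterator<String[]> and the column order matches the table), those six steps translate to something like:

    import java.sql.*;
    import java.util.Iterator;

    // Sketch of the six steps above; column order and batch size are illustrative.
    class BatchInserter {

        static void insertInBatches(Connection conn, Iterator<String[]> rows) throws SQLException {
            String sql = "INSERT INTO MYTABLE (ARTICLEID, ATTRIBUTE, VALUE, LANGUAGE, TYPE) VALUES (?,?,?,?,?)";
            conn.setAutoCommit(false);                                  // step 2: start a transaction
            try (PreparedStatement ps = conn.prepareStatement(sql)) {   // step 1: prepare the insert once
                final int batchSize = 500;                              // "a few lines" per round trip
                int count = 0;
                while (rows.hasNext()) {                                // steps 3-5: loop over the CSV lines
                    String[] line = rows.next();
                    ps.setInt(1, Integer.parseInt(line[1]));            // assuming column 0 holds the dumped ID
                    ps.setString(2, line[2]);
                    ps.setString(3, line[3]);
                    ps.setInt(4, Integer.parseInt(line[4]));
                    ps.setInt(5, Integer.parseInt(line[5]));
                    ps.addBatch();
                    if (++count % batchSize == 0) {
                        ps.executeBatch();                              // step 4: send the small batch
                    }
                }
                ps.executeBatch();                                      // flush the last partial batch
                conn.commit();                                          // step 6: commit
            } catch (SQLException e) {
                conn.rollback();                                        // don't leave a half-done transaction
                throw e;
            }
        }
    }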

But basically, you would probably be better off using LOAD DATA.
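
For example, something along these lines; the file path and CSV layout are assumptions, and local loading must be allowed on both sides (the server's local_infile setting and the Connector/J URL property allowLoadLocalInfile=true):

    import java.sql.*;

    // Sketch: bulk-load the dump with LOAD DATA LOCAL INFILE instead of row-by-row inserts.
    // Assumes the CSV columns are in the same order as the table columns.
    public class BulkLoad {
        public static void main(String[] args) throws Exception {
            String url = "jdbc:mysql://localhost:3306/datadb?allowLoadLocalInfile=true";
            try (Connection conn = DriverManager.getConnection(url, "myuser", "password");
                 Statement st = conn.createStatement()) {
                st.execute(
                    "LOAD DATA LOCAL INFILE 'dump.csv' " +
                    "INTO TABLE MYTABLE " +
                    "FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"' " +
                    "LINES TERMINATED BY '\\n'");
            }
        }
    }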

If the number of rows is huge, the code will fail at step 2 with an out-of-memory error. You need a way to read the rows in chunks, perform a prepared-statement batch for each chunk, and continue until all the rows are processed (see the sketch below). This works for any number of rows, and the batching also improves performance. Other than this, I don't see any issue with the design.
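
One way to get the rows in chunks is to stream the file with OpenCSV instead of reading it all at once; a sketch, with the file path and chunk size as illustrative placeholders:

    import com.opencsv.CSVReader;
    import java.io.FileReader;
    import java.util.*;

    // Sketch: stream the CSV instead of materialising the whole dump as a List.
    class ChunkedCsvReader {

        // Returns the next chunk of at most chunkSize lines (empty list = end of file).
        static List<String[]> nextChunk(CSVReader reader, int chunkSize) throws Exception {
            List<String[]> chunk = new ArrayList<>();
            String[] line;
            while (chunk.size() < chunkSize && (line = reader.readNext()) != null) {
                chunk.add(line);
            }
            return chunk;
        }

        public static void main(String[] args) throws Exception {
            try (CSVReader reader = new CSVReader(new FileReader("dump.csv"))) {
                List<String[]> chunk;
                while (!(chunk = nextChunk(reader, 500)).isEmpty()) {
                    // insert this chunk here with a prepared-statement batch (addBatch/executeBatch)
                    System.out.println("read " + chunk.size() + " rows");
                }
            }
        }
    }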
