Importing data from CSV to SQLite table in Java
I have a .csv file with 500K records where each record has 4 columns. I want all of these records to get imported into an SQLite table in Java (JDBC).
I have tried using executeUpdate() and executeBatch(), but both of these are really slow. They process 400-500 records per minute.
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;
import java.util.Date;
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.text.ParseException;
import java.sql.*;

public class MyClass {
    public static void main(String[] args) throws FileNotFoundException, ParseException, SQLException, ClassNotFoundException {
        Connection c = null;
        Statement stmt = null;
        try {
            Class.forName("org.sqlite.JDBC");
            c = DriverManager.getConnection("jdbc:sqlite:mydb.db");
            stmt = c.createStatement();
            String drop_sql = "DROP TABLE IF EXISTS MyTable";
            stmt.executeUpdate(drop_sql);
            String create_sql = "CREATE TABLE MyTable " +
                    "(VAR1 CHAR(50) NOT NULL, " +
                    "VAR2 CHAR(10) PRIMARY KEY NOT NULL," +
                    " VAR3 TEXT NOT NULL, " +
                    " VAR4 TEXT NOT NULL )";
            stmt.executeUpdate(create_sql);
            File premFile = new File("MyFile.csv");
            DateFormat df = new SimpleDateFormat("dd/MM/yyyy");
            Scanner scanner = new Scanner(premFile);
            scanner.useDelimiter(",");
            int i = 0, count = 500000;
            while (i < count) {
                String myRecord = scanner.nextLine();
                String[] cols = myRecord.split(",");
                String var1 = cols[0];
                String var2 = cols[1];
                Date var3 = df.parse(cols[2]);
                Date var4 = df.parse(cols[3]);
                String query = "INSERT INTO MyTable VALUES (" +
                        "'" + var1 + "', " +
                        "'" + var2 + "', " +
                        "'" + var3 + "', " +
                        "'" + var4 + "')";
                stmt.addBatch(query);
                i++;
            }
            stmt.executeBatch();
            stmt.close();
            c.close();
        } catch (Exception e) {
            System.err.println(e.getClass().getName() + ": " + e.getMessage());
            System.exit(0);
        }
    }
}
If I go the SQLite way, and import the csv to the table using .import my_file.csv my_table, I get the full task done within seconds. Is there any way to achieve a similar performance by using only Java code?
I tried PreparedStatement, but it did not have visible improvements.
I think your biggest problem here might be that you are going back to the file on each iteration. I would try loading the lines into an array and processing from there.
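A minimal sketch of that suggestion, assuming the whole file fits comfortably in memory and that no field contains a quoted comma. The class name CsvLoader and its two methods are hypothetical names chosen for illustration. It only covers the file-reading side, so whether it helps the overall import time would need to be measured.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class CsvLoader {
    // Reads every line of the file into memory in one pass, then parses them.
    public static List<String[]> loadRows(Path csv) throws IOException {
        List<String> lines = Files.readAllLines(csv, StandardCharsets.UTF_8);
        return parseLines(lines);
    }

    // Splits each non-empty line on commas.
    // Assumption: simple CSV with no quoted fields or embedded commas.
    public static List<String[]> parseLines(List<String> lines) {
        List<String[]> rows = new ArrayList<>(lines.size());
        for (String line : lines) {
            if (line.isEmpty()) {
                continue; // skip blank lines, e.g. a trailing newline
            }
            rows.add(line.split(",", -1)); // -1 keeps trailing empty columns
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        // The file name defaults to the one used in the question.
        Path csv = Paths.get(args.length > 0 ? args[0] : "MyFile.csv");
        if (!Files.exists(csv)) {
            System.err.println("File not found: " + csv);
            return;
        }
        List<String[]> rows = loadRows(csv);
        System.out.println("Loaded " + rows.size() + " rows");
    }
}
```

The insert loop can then iterate over the returned list instead of calling scanner.nextLine() for each record.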
PS: You might not want to use scanner.useDelimiter(",") as you are using scanner.nextLine() anyway and not scanner.next(). I believe this does nothing, although I might be incorrect in saying so. Give it a go.
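A small check of the delimiter point, using an in-memory string instead of the real file. The class name DelimiterDemo and its helper methods are hypothetical. It compares what nextLine() returns with and without useDelimiter(","), and what next() returns when the delimiter is set.

```java
import java.util.Scanner;

public class DelimiterDemo {
    // Returns the first line as read by nextLine(), optionally after
    // setting "," as the token delimiter.
    public static String firstLine(String input, boolean setDelimiter) {
        try (Scanner scanner = new Scanner(input)) {
            if (setDelimiter) {
                scanner.useDelimiter(",");
            }
            return scanner.nextLine();
        }
    }

    // Returns the first token as read by next() with "," as the delimiter.
    public static String firstToken(String input) {
        try (Scanner scanner = new Scanner(input)) {
            scanner.useDelimiter(",");
            return scanner.next();
        }
    }

    public static void main(String[] args) {
        String input = "a,b,c,d\ne,f,g,h\n";
        System.out.println(firstLine(input, true));  // a,b,c,d
        System.out.println(firstLine(input, false)); // a,b,c,d
        System.out.println(firstToken(input));       // a
    }
}
```

nextLine() returns the whole line in both cases, so the delimiter only matters for token methods such as next().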