Java: Best way to generate SQL after parsing an XML file using JAXB and insert into a database without duplicates?
I've been assigned a task to unmarshal an XML file using JAXB, generate the corresponding SQL statements, and fire them at the database. I've used the following method to generate the list of SQL statements:
public List<String> getSqlOfNationalityList(File file)
        throws JAXBException, FileNotFoundException, UnsupportedEncodingException {
    List<String> unNationalityList = new ArrayList<String>();
    JAXBContext jaxbContext = JAXBContext.newInstance(ObjectFactory.class);
    Unmarshaller unmarshaller = jaxbContext.createUnmarshaller();
    CONSOLIDATEDLIST consolidated = (CONSOLIDATEDLIST) unmarshaller.unmarshal(file);
    // accessing individuals properties
    INDIVIDUALS individuals = consolidated.getINDIVIDUALS();
    List<INDIVIDUAL> list = individuals.getINDIVIDUAL();
    for (INDIVIDUAL individual : list) {
        NATIONALITY nationality = individual.getNATIONALITY();
        if (nationality == null) {
            continue;
        }
        List<String> values = nationality.getVALUE();
        if (values == null) {
            continue;
        }
        for (String value : values) {
            StringBuilder builder = new StringBuilder();
            builder.append("INSERT INTO LIST_UN_NATIONALITY(\"DATAID\",\"VALUE\")");
            builder.append(" VALUES(");
            builder.append("'").append(individual.getDATAID()).append("',");
            // covers both a null value and the literal string "null"
            if (value == null || "null".equals(value)) {
                builder.append("' '");
            } else {
                builder.append("'").append(value.replace("'", "/")).append("'");
            }
            builder.append(");");
            builder.append("\r\n");
            unNationalityList.add(builder.toString());
        }
    }
    return unNationalityList;
} // end of nationality list method
I have used the following method to read from the list and insert into the database:
private void readListAndInsertToDb(List<String> list) {
    int duplicateCount = 0;
    int totalCount = 0;
    for (String sql : list) {
        try {
            jdbcTemplate.update(sql);
        } catch (DuplicateKeyException dke) {
            duplicateCount++;
        } catch (DataAccessException e) {
            e.printStackTrace();
        }
        totalCount++;
    }
    System.out.println("\r\nTotal : " + totalCount);
    System.out.println("Total duplicate : " + duplicateCount);
}
Now the issue is, I have about 13-14 similar lists, and the XML file contains records which may already exist in the database.
No, no, don't generate a list of SQL statements. Especially don't interpolate values into them as strings! Awoogah, awoogah: SQL injection alert.
Don't use a try/catch approach for duplicate handling either.
Improvements, from simple and easy to harder but best:
At bare minimum, use a PreparedStatement with bind parameters. Prepare it once, then execute it for each input with the parameters from the current data row.
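A minimal sketch of that approach, using the table and columns from the question (the `insertAll` helper and the paired-lists input shape are my own assumptions, not part of the original code):

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

public class NationalityInserter {
    // Bind parameters (?) replace the hand-built string concatenation,
    // so values can never be interpreted as SQL.
    static final String INSERT_SQL =
            "INSERT INTO LIST_UN_NATIONALITY (\"DATAID\", \"VALUE\") VALUES (?, ?)";

    // Prepared once, executed once per row; dataIds.get(i) pairs with values.get(i).
    static void insertAll(Connection conn, List<String> dataIds, List<String> values)
            throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
            for (int i = 0; i < dataIds.size(); i++) {
                ps.setString(1, dataIds.get(i));
                ps.setString(2, values.get(i));
                ps.executeUpdate();
            }
        }
    }
}
```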
You cannot rely on drivers throwing DuplicateKeyException; you should also catch SQLException and check the SQLSTATE. Unless, of course, you plan on using one specific DBMS and your code checks that you're using the expected driver + version.
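Checking the SQLSTATE is portable across drivers because the codes come from the SQL standard rather than from any one driver's exception hierarchy; `23505` is PostgreSQL's unique_violation code (class 23 covers integrity constraint violations). A small sketch:

```java
import java.sql.SQLException;

public class SqlStateChecks {
    // SQLSTATE "23505" = unique_violation in PostgreSQL; rely on this code,
    // not on whichever SQLException subclass the driver happens to throw.
    static boolean isUniqueViolation(SQLException e) {
        return "23505".equals(e.getSQLState());
    }
}
```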
Better, use PostgreSQL's INSERT ... ON CONFLICT DO NOTHING feature to handle conflicts without needing exception handling. This lets you batch your inserts, doing many per transaction for better performance.
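A sketch of the batched form, assuming PostgreSQL 9.5+ and the question's table (the `insertBatch` helper and `String[]` row shape are illustrative assumptions):

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

public class ConflictFreeInserter {
    // ON CONFLICT DO NOTHING silently skips rows that violate a unique constraint.
    static final String UPSERT_SQL =
            "INSERT INTO LIST_UN_NATIONALITY (\"DATAID\", \"VALUE\") VALUES (?, ?) "
            + "ON CONFLICT DO NOTHING";

    // One transaction, one batch; returns the number of rows actually inserted
    // (skipped duplicates report an update count of 0).
    static int insertBatch(Connection conn, List<String[]> rows) throws SQLException {
        conn.setAutoCommit(false);
        try (PreparedStatement ps = conn.prepareStatement(UPSERT_SQL)) {
            for (String[] row : rows) {
                ps.setString(1, row[0]); // DATAID
                ps.setString(2, row[1]); // VALUE
                ps.addBatch();
            }
            int inserted = 0;
            for (int count : ps.executeBatch()) {
                inserted += count;
            }
            conn.commit();
            return inserted;
        }
    }
}
```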
Further improve performance by using a multi-row VALUES list for INSERT ... ON CONFLICT DO NOTHING.
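Building the multi-row statement might look like this (a sketch; the values themselves are still bound as parameters, never inlined):

```java
public class MultiRowInsert {
    // Builds "INSERT ... VALUES (?, ?), (?, ?), ... ON CONFLICT DO NOTHING"
    // for rowCount rows, so one round trip inserts many rows.
    static String buildSql(int rowCount) {
        StringBuilder sb = new StringBuilder(
                "INSERT INTO LIST_UN_NATIONALITY (\"DATAID\", \"VALUE\") VALUES ");
        for (int i = 0; i < rowCount; i++) {
            sb.append(i == 0 ? "(?, ?)" : ", (?, ?)");
        }
        return sb.append(" ON CONFLICT DO NOTHING").toString();
    }
}
```

Bind parameters are then set positionally (parameter 2·i+1 and 2·i+2 for row i) before a single `executeUpdate()`.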
Even better, COPY all the data, including duplicates, into a TEMPORARY table using PgJDBC's CopyManager interface (see PGConnection.getCopyAPI()), create an index on the key used for duplicate detection, then LOCK the destination table and do a bulk
INSERT INTO real_table SELECT ... FROM temp_table WHERE NOT EXISTS (SELECT 1 FROM real_table WHERE temp_table.key = real_table.key)
or similar. This will be way faster. You can use INSERT ... ON CONFLICT DO NOTHING instead if you're on a new enough PostgreSQL.
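The SQL for that staging approach might look like the following sketch against the question's table. The `COPY` step itself runs through PgJDBC's `CopyManager.copyIn(sql, reader)`, obtained via `conn.unwrap(PGConnection.class).getCopyAPI()`; that call is left out here so the snippet compiles without the PostgreSQL driver on the classpath. The `staging` table name and CSV format are assumptions:

```java
public class StagingMerge {
    // All steps run inside one transaction, in this order.
    static final String CREATE_STAGING =
            "CREATE TEMPORARY TABLE staging (LIKE LIST_UN_NATIONALITY) ON COMMIT DROP";
    // Passed to CopyManager.copyIn(...) together with a Reader over the data.
    static final String COPY_IN =
            "COPY staging FROM STDIN WITH (FORMAT csv)";
    // Index on the duplicate-detection key speeds up the anti-join below.
    static final String INDEX_KEY =
            "CREATE INDEX ON staging (\"DATAID\", \"VALUE\")";
    // Lock the destination so no concurrent insert can race the NOT EXISTS check.
    static final String LOCK_DEST =
            "LOCK TABLE LIST_UN_NATIONALITY IN SHARE ROW EXCLUSIVE MODE";
    // DISTINCT also collapses duplicates within the incoming file itself.
    static final String MERGE =
            "INSERT INTO LIST_UN_NATIONALITY "
            + "SELECT DISTINCT s.* FROM staging s "
            + "WHERE NOT EXISTS (SELECT 1 FROM LIST_UN_NATIONALITY r "
            + "WHERE r.\"DATAID\" = s.\"DATAID\" AND r.\"VALUE\" = s.\"VALUE\")";
}
```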