Deleting records from Oracle 11g with JDBC

I need to delete 10,000 records from a table containing 9 million records. The IDs to be deleted will be fetched from a complex query and stored in a Java collection.

I have 3 approaches to implement this:

1) Create a prepared statement, add 10,000 statements to the batch, and execute it.

The statement will look like this:

Delete from <table_name> where id=?;

2) Write an 'in' query rather than using '=' in a batch.

For this, the 10,000 IDs can either be built as comma-separated values in Java code and added to the query, or be inserted into a temporary table that is then selected from in a subquery:

Delete from <table_name> where id in (<CSV>);
                 or
Delete from <table_name> where id in (select id from <temp_table>);

There are no constraints or indexes on the table, and I cannot add any, because I'm working on an existing table.

The first option is taking ages to complete. It ran for 15 hours and still had not completed.
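
For reference, the first option amounts to roughly the following sketch. The class name `BatchDelete`, the table name passed in, and the flush size are illustrative assumptions, not taken from the real code:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

// Hypothetical helper: one "DELETE ... WHERE id = ?" per ID, sent in JDBC batches.
final class BatchDelete {

    /** Returns how many times the batch was flushed to the database. */
    static int deleteInBatches(Connection conn, String table, List<Long> ids, int batchSize)
            throws SQLException {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be at least 1");
        }
        String sql = "DELETE FROM " + table + " WHERE id = ?"; // table name must be trusted
        int flushes = 0;
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            int pending = 0;
            for (Long id : ids) {
                ps.setLong(1, id);
                ps.addBatch();
                if (++pending == batchSize) { // flush a full batch
                    ps.executeBatch();
                    flushes++;
                    pending = 0;
                }
            }
            if (pending > 0) {                // flush the remainder
                ps.executeBatch();
                flushes++;
            }
        }
        return flushes;
    }
}
```

Batching only cuts the number of round trips. With no index on id, each of the 10,000 deletes most likely still has to scan the whole table, which would explain the running time.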

Your first version of the 'in' query (the CSV list) has a limit of 1000 values and tends not to perform well. The second version (the temporary table) may perform better, but you have to have a global temporary table, and populating it is an extra step.

You can convert your Java collection to an Oracle collection. You can create your own table type for this, but there are built-in ones like ODCINUMBERLIST which you can use here. You can then treat that as a table collection expression.

The details may vary slightly depending on your Java collection type, but the outline is something like:

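import oracle.jdbc.OracleCallableStatement;
import oracle.sql.ArrayDescriptor;

// Describe the built-in Oracle collection type and wrap the IDs in it
ArrayDescriptor aDesc = ArrayDescriptor.createDescriptor("SYS.ODCINUMBERLIST",
  conn);
oracle.sql.ARRAY oraIDs = new oracle.sql.ARRAY(aDesc, conn, yourJavaCollectionOfIDs);

// Bind the whole collection as a single parameter
OracleCallableStatement cStmt = (OracleCallableStatement) conn.prepareCall(
  "Delete from <table_name> "
  + "where id in (select column_value from table(?))");
cStmt.setArray(1, oraIDs);
cStmt.execute();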

Unless it is already a simple array, you will need to convert your Java collection to an array in the call; e.g. if you're using an ArrayList called yourArrayList, you would do:

oracle.sql.ARRAY oraIDs = new oracle.sql.ARRAY(aDesc, conn, yourArrayList.toArray());

You will still suffer from the lack of a primary key or index, but this gives Oracle a better chance to optimise the delete than the CSV list (or multiple CSV lists OR'd together, as you have more than 1000 IDs).

You should not use the first option, executing 10,000 statements from your Java code.

Creating a temp table is a good idea. But in Oracle you generally cannot have an IN (...) clause with more than 1000 items, so your CSV approach may not succeed.

You could go with:

Delete from <table_name> where id in (select id from <temp_table>);

But this way is not optimized either. It would be better to change your delete statement into this:

Delete from <table_name> m where exists (select id from <temp_table> t where m.id = t.id);

But if you do such operations frequently, it's highly recommended to add some constraints and indexes to your <table_name> and even to your <temp_table>. It will greatly reduce the execution time of these operations.
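
From JDBC, the temp-table variant could look roughly like the sketch below. It assumes a global temporary table already exists; the names `tmp_ids` and `my_table` and the class `TempTableDelete` are placeholders for illustration, and the IDs are assumed to fit in a Java long:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Collection;

// Hypothetical helper: load the IDs into a temporary table, then delete with EXISTS.
final class TempTableDelete {
    // Placeholder names; replace with your real <temp_table> and <table_name>.
    static final String INSERT_SQL = "INSERT INTO tmp_ids (id) VALUES (?)";
    static final String DELETE_SQL =
        "DELETE FROM my_table m WHERE EXISTS (SELECT 1 FROM tmp_ids t WHERE t.id = m.id)";

    /**
     * Returns the number of rows deleted. The caller owns the transaction:
     * if the temporary table was created with ON COMMIT DELETE ROWS,
     * auto-commit must be off, or the IDs vanish before the delete runs.
     */
    static int deleteViaTempTable(Connection conn, Collection<Long> ids) throws SQLException {
        try (PreparedStatement insert = conn.prepareStatement(INSERT_SQL)) {
            for (Long id : ids) {
                insert.setLong(1, id);
                insert.addBatch();
            }
            insert.executeBatch(); // send all inserts as one batch
        }
        try (PreparedStatement delete = conn.prepareStatement(DELETE_SQL)) {
            return delete.executeUpdate(); // a single delete statement for all IDs
        }
    }
}
```

The point of this shape is that the big table is hit by one delete statement rather than 10,000, so Oracle can choose a single plan for the whole operation.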

WHERE ... IN (...) is the way to go.

The IN clause can reference a temporary table that you've populated (your original idea), or it can contain any chosen (fixed) number of ? parameters. This reduces the number of DB round trips by a factor equal to the chosen number, but not necessarily to one. Iterate over your collection and process it in chunks.
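
A minimal sketch of the chunked variant, assuming numeric IDs that fit in a long. The class `ChunkedInDelete`, its method names, and the trick of padding the last chunk by repeating an ID are illustrative choices, not the only way to do it:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: delete in chunks using a fixed number of ? parameters.
final class ChunkedInDelete {

    /** Builds "DELETE FROM table WHERE id IN (?, ?, ..., ?)" with n placeholders. */
    static String buildSql(String table, int n) {
        return "DELETE FROM " + table + " WHERE id IN ("
            + String.join(", ", Collections.nCopies(n, "?")) + ")";
    }

    /** Splits ids into consecutive sublists of at most size elements. */
    static List<List<Long>> chunks(List<Long> ids, int size) {
        if (size < 1) {
            throw new IllegalArgumentException("size must be at least 1");
        }
        List<List<Long>> out = new ArrayList<>();
        for (int i = 0; i < ids.size(); i += size) {
            out.add(ids.subList(i, Math.min(i + size, ids.size())));
        }
        return out;
    }

    /** Runs one statement per chunk; chunkSize should stay at or below 1000 for Oracle. */
    static int delete(Connection conn, String table, List<Long> ids, int chunkSize)
            throws SQLException {
        int deleted = 0;
        try (PreparedStatement ps = conn.prepareStatement(buildSql(table, chunkSize))) {
            for (List<Long> chunk : chunks(ids, chunkSize)) {
                for (int i = 0; i < chunkSize; i++) {
                    // Pad a short final chunk by repeating its last ID;
                    // duplicate values in an IN list do not change the result.
                    ps.setLong(i + 1, chunk.get(Math.min(i, chunk.size() - 1)));
                }
                deleted += ps.executeUpdate();
            }
        }
        return deleted;
    }

    public static void main(String[] args) {
        System.out.println(buildSql("my_table", 3));
    }
}
```

Keeping the number of placeholders fixed means the same prepared statement can be reused for every chunk, so 10,000 IDs at a chunk size of 1000 become 10 round trips.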

Try it like this:

Delete from <table_name> where
    id in (1, 2, 3, ... ,1000)
    or id in (1001, 1002, ... , 2000)
    ....
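
A statement of this shape could be generated from the Java collection along these lines. This is only a sketch: the class `OrInClauseBuilder` and the table name are made up for illustration, and the IDs are assumed to be numeric so that inlining them as literals is safe:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper: builds "id IN (...) OR id IN (...)" groups of limited size.
final class OrInClauseBuilder {

    /** maxPerList should be at most 1000 to stay within Oracle's IN-list limit. */
    static String buildDelete(String table, List<Long> ids, int maxPerList) {
        if (ids.isEmpty() || maxPerList < 1) {
            throw new IllegalArgumentException("need at least one ID and a positive list size");
        }
        List<String> groups = new ArrayList<>();
        for (int i = 0; i < ids.size(); i += maxPerList) {
            // Render one group of IDs as comma-separated literals
            String csv = ids.subList(i, Math.min(i + maxPerList, ids.size())).stream()
                .map(id -> Long.toString(id))
                .collect(Collectors.joining(", "));
            groups.add("id IN (" + csv + ")");
        }
        return "DELETE FROM " + table + " WHERE " + String.join(" OR ", groups);
    }

    public static void main(String[] args) {
        System.out.println(buildDelete("my_table", Arrays.asList(1L, 2L, 3L), 2));
    }
}
```

Because the IDs are inlined as literals rather than bound as parameters, every distinct set of IDs produces a different statement text, so this fits a one-off cleanup better than a frequently repeated job.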
