
Delete records from a table using another table?

Note to the editors: please edit the title if you have a better one :)

My question is:

I have two tables in my database:

     -----------
     | table1   |
     |----------|
     | id       |
     |text      |
     ===========


     -----------
     | table2   |
     |----------|
     | id       |
     |text      |
     ===========

table1 has 600,000 records.

table2 has 5,000,000 records!! :)

What is the best way to delete all the records in table2 that are not in table1?

By "best" I mean the fastest way, because I don't want to wait 4 hours for the process to complete.

Do you have something better than the following code?

<?PHP
   $sql = "select text from table2";
   $result = mysql_query($sql) or die(mysql_error());
   while($row = mysql_fetch_array($result)){
        $text = $row["text"];
        $sql2 = "select id from table1 where text = '$text'";
        $query2 = mysql_query($sql2) or die(mysql_error());
        $result2 = mysql_num_rows($query2);
        if($result2==0){
             $sql3 = "delete from table2 where text = '$text'";
             $query3 = mysql_query($sql3) or die(mysql_error());
        }
   }
?>

Thanks

What about letting the RDBMS handle it?

For example:

DELETE FROM table2 WHERE text NOT IN (select distinct text from table1)

Cheers

PS: make a backup before testing...
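
One caveat with the NOT IN form: if table1.text can contain NULL, the NOT IN test never evaluates to true and the statement deletes nothing. A NOT EXISTS variant avoids that. This is only a sketch, assuming the table and column names from the question; it is not something the answer above proposed:

```sql
-- Delete every table2 row that has no table1 row with the same text.
-- Unlike NOT IN, this is not affected by NULLs in table1.text.
-- Note: table2 rows whose own text is NULL count as unmatched and are deleted.
DELETE FROM table2
WHERE NOT EXISTS (
    SELECT 1
    FROM table1
    WHERE table1.text = table2.text
);
```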

Your solution runs roughly 2 queries per row of table2, which means several million queries, and that will be rather slow ^^

Using MySQL, you might be able to do all this in only one query: the DELETE statement has a multiple-table syntax, so it can join against another table.

The first thing is to write the SELECT statement that matches the data you want to delete (a better way to test than trying a DELETE without knowing whether it will really hit the right data); something like this might do:

select table2.*
from table2
    left join table1 on table1.text = table2.text
where table1.id is NULL

This should return all rows that are in table2 but not in table1.

Once you are sure this query returns the right data, you can transform it into a DELETE query:

delete table2
from table2
    left join table1 on table1.text = table2.text
where table1.id is NULL

This might do. Of course, it would be best to test first on a test database, not on your production one!
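
The "check first, then delete" idea can be sketched as follows. It assumes table2 is an InnoDB table (so the DELETE can be rolled back) and that table1.id is never NULL, for example because it is the primary key; neither is stated in the question:

```sql
-- 1. How many rows would be removed?
SELECT COUNT(*)
FROM table2
    LEFT JOIN table1 ON table1.text = table2.text
WHERE table1.id IS NULL;

-- 2. Run the delete inside a transaction and compare the counts.
START TRANSACTION;

DELETE table2
FROM table2
    LEFT JOIN table1 ON table1.text = table2.text
WHERE table1.id IS NULL;

SELECT ROW_COUNT();  -- rows affected by the DELETE just above

-- 3. If the number matches step 1, run COMMIT; otherwise run ROLLBACK.
```

On a non-transactional engine such as MyISAM the ROLLBACK has no effect, so the backup matters even more there.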

Otherwise, something with NOT IN and a subquery might do, a bit like:

delete
from table2
where text not in (select text from table1)

Not sure which will be faster, though, considering the amount of data you have. Either way, I would not do the kind of PHP loop you proposed, but would go with a SQL query that can delete everything by itself: avoiding all those calls from PHP to the DB will almost certainly make things faster!
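
Whichever form you pick, the server has to look up each table2.text value in table1, so an index on table1.text will probably matter more than the exact query shape. A minimal sketch, assuming text is a VARCHAR column; the index name idx_table1_text is made up for illustration:

```sql
-- Index the lookup side of the comparison.
CREATE INDEX idx_table1_text ON table1 (`text`);

-- If the column is a TEXT type, MySQL requires a prefix length instead, e.g.:
-- CREATE INDEX idx_table1_text ON table1 (`text`(100));

-- Check the plan for the matching SELECT before running any DELETE.
EXPLAIN
SELECT table2.*
FROM table2
    LEFT JOIN table1 ON table1.text = table2.text
WHERE table1.id IS NULL;
```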

Why not add a new one-byte column to table2, then run a single UPDATE that sets it to true or 'Y' for every row that is in both tables?

Then, just delete the rows that don't have this column set.

That would seem to be the simplest and fastest, IMO.
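
The flag-column idea above could look roughly like this in MySQL. The column name keep_flag is hypothetical, and matching on text is taken from the question:

```sql
-- 1. Add a one-byte marker column, initially 0 for every row.
ALTER TABLE table2 ADD COLUMN keep_flag TINYINT(1) NOT NULL DEFAULT 0;

-- 2. Mark the rows whose text also exists in table1.
UPDATE table2
    JOIN table1 ON table1.text = table2.text
SET table2.keep_flag = 1;

-- 3. Delete everything that was not marked.
DELETE FROM table2 WHERE keep_flag = 0;

-- 4. Remove the helper column again.
ALTER TABLE table2 DROP COLUMN keep_flag;
```

Keep in mind that the two ALTER TABLE statements may themselves rebuild a 5,000,000-row table, so it is worth timing this on a copy first.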

Try this:

DELETE FROM table2 WHERE id NOT IN (SELECT id FROM table1)

Note: make a backup before running the query.

create table3 like table2
insert into table3 (select table2.id, table2.text from table1 join table2 on ...)
drop table2
rename table3 to table2

This involves a bit of management (so it's only a valid option if you can easily drop/alter tables), but at least the DML part will outperform any other option, methinks.
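
Spelled out for MySQL, the copy-and-swap steps might look like the sketch below. The join condition is elided in the answer; matching on text is an assumption taken from the question. The sketch also swaps the join for EXISTS so that duplicate text values in table1 cannot copy a table2 row twice, and the name table2_old is hypothetical:

```sql
-- 1. Empty copy of table2, including its indexes.
CREATE TABLE table3 LIKE table2;

-- 2. Copy only the rows worth keeping.
INSERT INTO table3 (id, text)
SELECT table2.id, table2.text
FROM table2
WHERE EXISTS (
    SELECT 1
    FROM table1
    WHERE table1.text = table2.text
);

-- 3. Swap the tables in one statement, keeping the old one until verified.
RENAME TABLE table2 TO table2_old, table3 TO table2;

-- 4. Once the new table2 looks right:
DROP TABLE table2_old;
```

Renaming first and dropping last means the original data is still around if the copy turns out to be wrong.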
