
MySQL comparison of same field between two entries: query takes forever

I'm completely useless with databases, but currently I have to work with one.

I need to make a query that compares date values between two different entries of my table. I have a query like this:

SELECT t1.serial_number, t1.fault_type, t2.fault_type 
FROM  shipped_products t1 
      JOIN shipped_products t2 ON t1.serial_number=t2.serial_number 
WHERE ABS(DATEDIFF(t2.date_rcv,t1.date_rcv))<90;

But it's taking forever to run. Really, I left it running for 18 hours and it never stopped. Is this query correct? Is there a better, more clever way to do this?

Thank you very much, guys.

BTW: I'll automate the whole process with Python scripts, so if you know of a better way to do this inside Python without all the logic having to be inside the query, that would also help.

EDIT: My question seems unclear, so I'll explain better what I need to do. Sometimes products go to repair centers and are shipped back to clients as "No Defect Found". After that, the client ships the product to the repair center again because it presents the same issue. So I need a query to count how many products have been to repair centers twice within an interval of 90 days. The unique ID for each product is its serial number, which is why I'm searching for serial number duplicates.

Every record is going to match itself (in t1 and t2) in this join, since the DATEDIFF of a row with itself is 0 and thus less than 90. Make sure you are not matching a record to itself. If you have an ID field in your table, you could do this:

SELECT t1.serial_number, t1.fault_type, t2.fault_type 
FROM  shipped_products t1 
      JOIN shipped_products t2 
      ON t1.serial_number=t2.serial_number 
      AND t1.ID <> t2.ID
WHERE ABS(DATEDIFF(t2.date_rcv,t1.date_rcv))<90;

Also make sure you have an index (key) on serial_number.
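To make the self-matching concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The sample rows and the id column are made up for illustration, and julianday() replaces DATEDIFF because SQLite has no DATEDIFF function.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE shipped_products ("
    " id INTEGER PRIMARY KEY, serial_number TEXT,"
    " fault_type TEXT, date_rcv TEXT)"
)
# Hypothetical sample data: SN1 came back twice within 90 days, SN2 once.
conn.executemany(
    "INSERT INTO shipped_products (serial_number, fault_type, date_rcv)"
    " VALUES (?, ?, ?)",
    [
        ("SN1", "NDF", "2020-01-10"),
        ("SN1", "screen", "2020-02-15"),
        ("SN2", "NDF", "2020-01-05"),
    ],
)

# julianday() difference is the SQLite equivalent of DATEDIFF here.
BASE = """
SELECT COUNT(*)
FROM shipped_products t1
JOIN shipped_products t2
  ON t1.serial_number = t2.serial_number {extra}
WHERE ABS(julianday(t2.date_rcv) - julianday(t1.date_rcv)) < 90
"""

# Without the extra condition every row also pairs with itself.
with_self = conn.execute(BASE.format(extra="")).fetchone()[0]
# Excluding identical ids leaves only genuine pairs (still in both orders).
without_self = conn.execute(
    BASE.format(extra="AND t1.id <> t2.id")
).fetchone()[0]

print(with_self, without_self)
```

With this data the first count includes three self-pairs, and the second still reports the SN1 pair twice, once in each order.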

It is unclear to me why you would want duplicates in the results. If two rows meet the condition, the pair shows up twice in the result set, once in each order. Why not just look at records that come later? If you phrase the query like this:

SELECT t1.serial_number, t1.fault_type, t2.fault_type 
FROM  shipped_products t1 
      JOIN shipped_products t2 
      ON t1.serial_number = t2.serial_number 
WHERE t2.date_rcv >= t1.date_rcv
      AND t2.date_rcv < t1.date_rcv + INTERVAL 90 DAY;

Then the query can take advantage of an index on shipped_products(serial_number, date_rcv). Because of the >= comparison, each row still pairs with itself unless it is excluded, for example by ID. Note: Perhaps the 90 should be 180.
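A rough way to try the forward-looking window is again Python's sqlite3 module, used here as a stand-in for MySQL. The index name idx_serial_date, the id column, and the sample rows are assumptions, and SQLite's date(..., '+90 days') plays the role of MySQL's + INTERVAL 90 DAY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE shipped_products ("
    " id INTEGER PRIMARY KEY, serial_number TEXT,"
    " fault_type TEXT, date_rcv TEXT)"
)
# Composite index so the join can seek by serial number, then by date range.
conn.execute(
    "CREATE INDEX idx_serial_date"
    " ON shipped_products (serial_number, date_rcv)"
)
# Hypothetical sample data with ISO dates, which compare correctly as text.
conn.executemany(
    "INSERT INTO shipped_products (serial_number, fault_type, date_rcv)"
    " VALUES (?, ?, ?)",
    [
        ("SN1", "NDF", "2020-01-10"),
        ("SN1", "screen", "2020-02-15"),
        ("SN1", "battery", "2020-08-01"),
        ("SN2", "NDF", "2020-01-05"),
    ],
)

# Only look forward in time from t1; the id check drops self-matches.
# Two receipts on the same day would still appear in both orders.
pairs = conn.execute(
    """
    SELECT t1.serial_number, t1.fault_type, t2.fault_type
    FROM shipped_products t1
    JOIN shipped_products t2
      ON t1.serial_number = t2.serial_number
     AND t1.id <> t2.id
    WHERE t2.date_rcv >= t1.date_rcv
      AND t2.date_rcv < date(t1.date_rcv, '+90 days')
    """
).fetchall()

print(pairs)
```

The range condition on date_rcv is written without wrapping the column in a function, which is what lets an index on (serial_number, date_rcv) be used at all; the ABS(DATEDIFF(...)) form hides the column from the optimizer.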

I am suspicious when I see this type of self-join. Sometimes, it can be replaced with an aggregation query (sometimes not). However, what you actually want to do is unclear.
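Since the question also asks about doing the work in Python, one possible sketch is to fetch only (serial_number, date_rcv) pairs with a plain SELECT and count the repeats in Python. The function name count_repeat_returns and the sample records are hypothetical, and the code assumes the dates arrive as datetime.date objects.

```python
from collections import defaultdict
from datetime import date, timedelta


def count_repeat_returns(records, window_days=90):
    """Count serial numbers with two receipts less than window_days apart.

    records is an iterable of (serial_number, date_received) tuples.
    """
    dates_by_serial = defaultdict(list)
    for serial, received in records:
        dates_by_serial[serial].append(received)

    window = timedelta(days=window_days)
    repeats = 0
    for dates in dates_by_serial.values():
        dates.sort()
        # Once sorted, the closest pair of dates is always adjacent,
        # so comparing neighbours is enough.
        if any(b - a < window for a, b in zip(dates, dates[1:])):
            repeats += 1
    return repeats


# Hypothetical sample data: only SN1 returns within 90 days.
records = [
    ("SN1", date(2020, 1, 10)),
    ("SN1", date(2020, 2, 15)),
    ("SN2", date(2020, 1, 5)),
    ("SN3", date(2020, 1, 1)),
    ("SN3", date(2020, 6, 1)),
]
result = count_repeat_returns(records)
print(result)
```

This makes a single pass over the rows plus a sort per serial number, so it avoids the pairwise blow-up of the self-join. Note that a receipt recorded twice with the same date would count as a repeat, so duplicated rows should be removed first if they can occur.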
