SQL Group By Date Conflicts

I have a table with columns start_date and end_date. What we need to do is select everything and, for each object_id, group the rows by date conflicts.

A date conflict is when one row's date range overlaps another row's, i.e. its start date and/or end date falls inside the other row's range, or it encloses the other range entirely. For instance, here are some examples of conflicts:

Row 1 has dates 1st through the 5th, Row 2 has dates 2nd through the 3rd.

Row 1 has dates 2nd through the 5th, Row 2 has dates 1st through the 3rd.

Row 1 has dates 2nd through the 5th, Row 2 has dates 3rd through the 6th.

Row 1 has dates 2nd through the 5th, Row 2 has dates 1st through the 7th.
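
The four cases above all reduce to one test: two ranges conflict when each one starts on or before the day the other ends. A minimal Python sketch of that predicate follows. The function name `ranges_conflict` is made up for illustration, and treating the end day as inclusive is an assumption, since none of the examples shows two ranges that merely touch.

```python
def ranges_conflict(start_a, end_a, start_b, end_b):
    """Return True when the two inclusive day ranges share at least one day."""
    return start_a <= end_b and start_b <= end_a


# The four example pairs from the text, as (row 1 range, row 2 range).
examples = [
    ((1, 5), (2, 3)),
    ((2, 5), (1, 3)),
    ((2, 5), (3, 6)),
    ((2, 5), (1, 7)),
]

for (s1, e1), (s2, e2) in examples:
    print((s1, e1), (s2, e2), ranges_conflict(s1, e1, s2, e2))
```

The same condition is what an overlap join in SQL would compare on, so it can serve as a quick check when reasoning about a query by hand.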

So for example, if we have some sample data (assume the numbers are just days of the month for simplicity):

id | object_id | start_date | end_date
1  | 1         | 1          | 5
2  | 1         | 2          | 4
3  | 1         | 6          | 8
4  | 2         | 2          | 3

What I would expect to see is this:

object_id | start_date | end_date | numconflicts
1         | <na>       | <na>     | 2
1         | 6          | 8        | 0 or null
2         | 2          | 3        | 0 or null

And for a second test case, here is some sample data:

id | object_id | start_date | end_date
1  | 1         | 1          | 5
2  | 1         | 2          | 4
3  | 1         | 6          | 8
4  | 2         | 2          | 3
5  | 2         | 4          | 5
6  | 1         | 2          | 3
7  | 1         | 10         | 12
8  | 1         | 11         | 13

And for the second test case, what I would expect to see as output:

object_id | start_date | end_date | numconflicts
1         | <na>       | <na>     | 3
1         | 6          | 8        | 0 or null
2         | 2          | 3        | 0 or null
2         | 4          | 5        | 0 or null
1         | <na>       | <na>     | 2

Yes, I will need some way of differentiating the first and the second grouping (the first and last rows), but I haven't quite figured that out. The goal is to view this list, and then when you click on a group of conflicts you can view all of the conflicts in that group.

My first thought was to attempt some GROUP BY CASE ... clause, but I couldn't wrap my head around it.

The language I am using to call MySQL is PHP, so if someone knows of a PHP-loop solution rather than a large MySQL query, I am all ears.

Thanks in advance.

Edit: Added primary keys to reduce confusion.

Edit: Added a second test case to give more context.

This query finds, for each row, the number of other rows it conflicts with (dups):

select od1.object_id, od1.start_date, od1.end_date, sum(od2.id is not null) as dups
from object_date od1
left join object_date od2
    on od2.object_id = od1.object_id
    and od2.end_date >= od1.start_date
    and od2.start_date <= od1.end_date
    and od2.id != od1.id
group by 1,2,3;

You can use this query as the basis of a query that gives you exactly what you asked for (see below for output).

select
  object_id,
  case dups when 0 then start_date else '<na>' end as start_date,
  case dups when 0 then end_date else '<na>' end as end_date,
  sum(dups) as dups
from (
  select od1.object_id, od1.start_date, od1.end_date, sum(od2.id is not null) as dups
  from object_date od1
  left join object_date od2
    on od2.object_id = od1.object_id
    and od2.end_date >= od1.start_date
    and od2.start_date <= od1.end_date
    and od2.id != od1.id
  group by 1,2,3) x
group by 1,2,3;

Note that I have used an id column to distinguish the rows. You could instead replace od2.id != od1.id with a test that at least one of the other columns differs, but that only makes sense if there is a unique index across all the other columns, and having an id column is a good idea anyway.

Here's a test using your data:

create table object_date (
    id int primary key auto_increment,
    object_id int,
    start_date int,
    end_date int
);
insert into object_date (object_id, start_date, end_date) 
    values (1,1,5),(1,2,4),(1,6,8),(2,2,3);

Output of first query when run against this sample data:

+-----------+------------+----------+------+
| object_id | start_date | end_date | dups |
+-----------+------------+----------+------+
|         1 |          1 |        5 |    1 |
|         1 |          2 |        4 |    1 |
|         1 |          6 |        8 |    0 |
|         2 |          2 |        3 |    0 |
+-----------+------------+----------+------+
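
The per-row counts above can also be computed in application code, which the question explicitly invites. The sketch below uses Python rather than PHP; the tuple layout and the name `count_conflicts` are assumptions for illustration. It mirrors the join conditions of the first query and is quadratic in the number of rows, so it only suits small tables.

```python
def count_conflicts(rows):
    """rows: list of (id, object_id, start_date, end_date) tuples.

    Returns a dict mapping each row id to the number of other rows of the
    same object whose inclusive date range overlaps it.
    """
    counts = {}
    for id1, obj1, start1, end1 in rows:
        counts[id1] = sum(
            1
            for id2, obj2, start2, end2 in rows
            if obj2 == obj1          # same object
            and end2 >= start1       # other row ends on/after this one starts
            and start2 <= end1       # other row starts on/before this one ends
            and id2 != id1           # do not count the row against itself
        )
    return counts


sample = [(1, 1, 1, 5), (2, 1, 2, 4), (3, 1, 6, 8), (4, 2, 2, 3)]
print(count_conflicts(sample))
```

For the sample data, rows 1 and 2 each get a count of 1 and rows 3 and 4 get 0, matching the dups column.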

Output of second query when run against this sample data:

+-----------+------------+----------+------+
| object_id | start_date | end_date | dups |
+-----------+------------+----------+------+
|         1 | 6          | 8        |    0 |
|         1 | <na>       | <na>     |    2 |
|         2 | 2          | 3        |    0 |
+-----------+------------+----------+------+
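
One caveat, as far as I can tell from reading the second query: every conflicting row of an object gets the same `<na>` placeholders, so the outer GROUP BY collapses them into a single row, and the two separate conflict groups that the second test case expects for object 1 would be merged. A loop-based sketch that keeps the groups apart is shown below, in Python rather than PHP. It assumes that rows belong to one group when their ranges chain together through overlaps, that end dates are inclusive, and that numconflicts is the number of rows in the group; `group_conflicts` and the dict keys are hypothetical names.

```python
from itertools import groupby
from operator import itemgetter


def group_conflicts(rows):
    """rows: iterable of (id, object_id, start_date, end_date) tuples.

    Returns one dict per group. A group is a run of rows of the same object
    whose inclusive ranges overlap, directly or through a chain of other rows.
    """
    groups = []
    # Sort by object, then by start date, so overlapping rows become adjacent.
    ordered = sorted(rows, key=itemgetter(1, 2, 3))
    for object_id, object_rows in groupby(ordered, key=itemgetter(1)):
        current = None
        for row_id, _, start, end in object_rows:
            if current is not None and start <= current["end_date"]:
                # Overlaps the running group: add it and extend the range.
                current["ids"].append(row_id)
                current["end_date"] = max(current["end_date"], end)
            else:
                # Gap before this row: start a new group.
                current = {"object_id": object_id, "start_date": start,
                           "end_date": end, "ids": [row_id]}
                groups.append(current)
    for group in groups:
        # A lone row has no conflicts; otherwise report the group size.
        size = len(group["ids"])
        group["numconflicts"] = size if size > 1 else 0
    return groups


rows = [(1, 1, 1, 5), (2, 1, 2, 4), (3, 1, 6, 8), (4, 2, 2, 3),
        (5, 2, 4, 5), (6, 1, 2, 3), (7, 1, 10, 12), (8, 1, 11, 13)]
for group in group_conflicts(rows):
    print(group)
```

For a multi-row group, start_date and end_date hold the span covered by the whole group, which could be displayed instead of `<na>`, and ids lists the member rows to show when a group is clicked.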

Oracle: This could be done with a subquery in a GROUP BY CASE statement.

https://forums.oracle.com/forums/thread.jspa?threadID=2131172

MySQL: You could have a view that lists all the conflicts.

select distinct a1.appt, a2.appt from appointment a1, appointment a2 where a1.appt <> a2.appt and a1.start < a2.end and a1.end > a2.start

and then simply do a count(*) on that view. (The a1.appt <> a2.appt test keeps an appointment from being paired with itself.)

Something like the following should work:

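select T1.object_id, T1.start_date, T1.end_date, count(T2.id) as numconflicts
from object_date T1   -- table name assumed; T1 and T2 alias the same table
left join object_date T2
    on T2.object_id = T1.object_id
    and T2.id <> T1.id
    and (T1.start_date between T2.start_date and T2.end_date
         or T2.start_date between T1.start_date and T1.end_date)
group by T1.id, T1.object_id, T1.start_date, T1.end_date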

I might be off a little bit, but it should help you get started.

Edit: Indented it properly.
