MySQL: counting the number of groups of rows containing a certain value

How can I get the number of "groups" of consecutive rows with status = 0, excluding groups that start the table and groups that span an hour or less? (If the time constraint is too difficult, we can instead exclude groups with 40 or fewer rows, since a row is logged roughly every 1 minute 30 seconds.)

For example, the following SAMPLE table, WITHOUT the time constraint, would produce 3 when counting groups with status = 0.

+------+----------+----------+
| id   | status   |time      |
+------+----------+----------+
| 0001 | 1        |11:32:48  |
+------+----------+----------+
| 0002 | 0        |11:30:26  |
+------+----------+----------+
| 0003 | 0        |11:28:54  |
+------+----------+----------+
| 0004 | 1        |11:27:23  |
+------+----------+----------+
| 0005 | 0        |11:25:52  |
+------+----------+----------+
| 0006 | 1        |11:24:20  |
+------+----------+----------+
| 0007 | 1        |11:22:48  |
+------+----------+----------+
| 0008 | 0        |11:21:17  |
+------+----------+----------+
| 0009 | 0        |11:19:45  |
+------+----------+----------+
| 0010 | 0        |11:18:14  |
+------+----------+----------+
| 0011 | 0        |11:16:43  |
+------+----------+----------+
| 0012 | 0        |11:15:11  |
+------+----------+----------+
| 0013 | 0        |11:13:39  |
+------+----------+----------+
| 0002 | 0        |11:12:08  |
+------+----------+----------+
| 0014 | 1        |11:10:37  |
+------+----------+----------+
| 0015 | 1        |11:09:05  |
+------+----------+----------+
| 0016 | 1        |11:07:33  |
+------+----------+----------+
| 0017 | 0        |11:06:02  |
+------+----------+----------+

One solution I can think of would be to grab the entire table and produce the result with Java, but I am afraid this would be too inefficient given that the table can have millions of entries.

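-- Count, per status, the rows whose status differs from the previous row's
-- (in time order). The first row is skipped, so a group that starts the
-- table is not counted.
select sum(is_different_from_previous) as group_count, status
from (
    select status, 
    (@prevStatus <> status and @prevStatus <> -1) is_different_from_previous,
    @prevStatus := status
    from myTable t1
    cross join (select @prevStatus := -1) t2
    order by t1.time
) t1 group by status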

For a specific status, filter the result:

select * from (
    select sum(is_different_from_previous) , status
    from (
        select status, 
        (@prevStatus <> status and @prevStatus <> -1) is_different_from_previous,
        @prevStatus := status
        from myTable t1
        cross join (select @prevStatus := -1) t2
        order by t1.time
    ) t1 group by status
) t1 where status = 0
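
To make the logic of the user-variable queries easier to follow, here is a minimal Python sketch of the same idea: walk the rows in time order and count, per status, each row whose status differs from the previous one, skipping the very first row. The function name `count_group_starts` and the hard-coded `sample` list are illustrative assumptions, not part of the original answer. Note also that MySQL 8.0 deprecates assigning user variables inside expressions, so the SQL above is best treated as a technique for older versions.

```python
from collections import Counter

def count_group_starts(statuses):
    """Count, per status, rows whose status differs from the previous row.

    The first row is never counted, mirroring the `@prevStatus <> -1` check.
    """
    counts = Counter()
    prev = None  # plays the role of @prevStatus := -1
    for status in statuses:
        if prev is not None and status != prev:
            counts[status] += 1
        prev = status
    return counts

# Statuses of the sample table in ascending time order (bottom row first).
sample = [0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1]
print(count_group_starts(sample)[0])  # 3
```

The result for status 0 is 3 because the group that starts the table (the 11:06:02 row) is skipped.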

Edit

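To count only groups of 0s that contain more than a given number of rows (4 in this example; use 40 for the threshold mentioned in the question):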

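-- groupNumber increases by 1 every time the status changes, so all rows of
-- one consecutive run share the same groupNumber.
select count(*) from (
    -- select only the grouped column so the query also runs under ONLY_FULL_GROUP_BY
    select groupNumber from (
        select status, 
        (@prevStatus <> status and @prevStatus <> -1) is_different_from_previous,
        if(@prevStatus <> status and @prevStatus <> -1,@groupNumber := @groupNumber + 1, @groupNumber) groupNumber,
        @prevStatus := status
        from myTable t1
        cross join (select @prevStatus := -1, @groupNumber := 0) t2
        order by t1.id
    ) t1
    where status = 0
    group by groupNumber
    having count(*) > 4  -- keep only runs with more than 4 rows
) t1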

http://sqlfiddle.com/#!9/e4a49/23
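
The query above filters groups by row count. For the time-based constraint in the question (groups spanning more than an hour), the following is a minimal Python sketch, assuming the rows are read in ascending time order as `(status, timestamp)` pairs, for example from a streaming cursor so the whole table is never held in memory. The names `count_long_groups` and `make_rows`, and the generated test data, are hypothetical and only serve the illustration.

```python
from datetime import datetime, timedelta
from itertools import groupby

def count_long_groups(rows, status=0, min_span=timedelta(hours=1)):
    """Count runs of `status` that span more than `min_span`.

    `rows` is an iterable of (status, timestamp) pairs sorted by time
    ascending. The run that starts the data is never counted.
    """
    count = 0
    for index, (key, run) in enumerate(groupby(rows, key=lambda r: r[0])):
        start = end = None
        for _, ts in run:  # keep only the first and last timestamp of the run
            if start is None:
                start = ts
            end = ts
        if index > 0 and key == status and end - start > min_span:
            count += 1
    return count

def make_rows(runs, start=datetime(2024, 1, 1), step=timedelta(seconds=90)):
    """Hypothetical helper: expand [(status, row_count), ...] into rows."""
    rows, ts = [], start
    for status, n in runs:
        for _ in range(n):
            rows.append((status, ts))
            ts += step
    return rows

# One row every 90 seconds: only the 50-row run of 0s spans more than an hour.
rows = make_rows([(0, 5), (1, 2), (0, 50), (1, 1), (0, 10)])
print(count_long_groups(rows))  # 1
```

Only the first and last timestamp of each run are kept, so memory use stays constant regardless of table size.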

Try the following modified query. It is more efficient than the earlier one because it eliminates another table scan and restricts the data to the last hour. The first group is also not counted.

EDIT: I changed the JOIN condition back to st2.id = st1.id+1 to satisfy the requirements.

select 
  st1.status, 
  count(st1.id)
from sampletable st1
inner join sampletable st2
on (st2.id = st1.id+1 and st2.status <> st1.status)
where st1.status = 0 AND st1.time >= DATE_SUB(NOW(), INTERVAL 1 hour)
group by st1.status;
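
As a rough illustration of what this self-join counts, the Python sketch below pairs each row with the row whose id is exactly one higher and counts the rows with the target status whose neighbour has a different status, that is, the last row of each run. It leaves out the time filter, and the name `count_group_ends` and the `sample_by_id` dictionary (the sample table without the row that repeats id 0002) are assumptions made for the example. It also shows a limitation of joining on `id + 1`: a gap in the ids removes the pairing, so that run is no longer counted.

```python
def count_group_ends(status_by_id, status=0):
    """Count rows with `status` whose id+1 neighbour exists and differs.

    Mirrors the join condition st2.id = st1.id+1 and st2.status <> st1.status.
    """
    return sum(
        1
        for row_id, row_status in status_by_id.items()
        if row_status == status
        and row_id + 1 in status_by_id
        and status_by_id[row_id + 1] != status
    )

# Sample table keyed by id (assumption: the duplicate 0002 row is left out).
sample_by_id = {
    1: 1, 2: 0, 3: 0, 4: 1, 5: 0, 6: 1, 7: 1, 8: 0, 9: 0,
    10: 0, 11: 0, 12: 0, 13: 0, 14: 1, 15: 1, 16: 1, 17: 0,
}
print(count_group_ends(sample_by_id))  # 3

# With id 4 missing, row 3 has no id+1 neighbour and its run is not counted.
with_gap = {k: v for k, v in sample_by_id.items() if k != 4}
print(count_group_ends(with_gap))  # 2
```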

An updated SQL Fiddle demo uses the same id and status data.
