SQL: Find group of rows for consecutive days absent

I have the following Attendance table in Microsoft SQL Server 2016:

ID         StudentID  Date          AbsenceReasonID
----------------------------------------------------
430957     10158      2018-02-02    2   
430958     10158      2018-02-03    2   
430959     10158      2018-02-04    11  
430960     12393      2018-03-15    9   
430961     1          2018-03-15    9   
430962     12400      2018-03-15    9   
430963     5959       2018-03-15    11  

I would like a query that retrieves each group of rows where a single student ( StudentID ) has 3 or more absences on consecutive days, judged by the Date column. Ideally, running the query on the data above would return:

ID         StudentID  Date          AbsenceReasonID
----------------------------------------------------
430957     10158      2018-02-02    2   
430958     10158      2018-02-03    2   
430959     10158      2018-02-04    11  

Note that if a student is absent on a Friday, I would like that to carry over the weekend to Monday (disregard weekend dates).

If any more information is required to help you assist me, please let me know. I have used the following query as a starting point, but I know it is not what I am looking for:

SELECT 
    CONVERT(datetime, A.DateOF, 103),
    A.SchoolNum, EI.FullName,
    COUNT(A.SchoolNum) as 'Absences'
FROM 
    Attendance A
INNER JOIN 
    EntityInformation EI ON EI.SchoolNum = A.SchoolNum AND EI.Deleted = 0
INNER JOIN 
    Enrolment E ON EI.SchoolNum = E.SchoolNum AND E.Deleted = 0
GROUP BY 
    A.SchoolNum, A.DateOf, FullName
HAVING 
    COUNT(A.SchoolNum) > 1
    AND A.DateOf = GETDATE()
    AND A.SchoolNum in (SELECT SchoolNum FROM Attendance A1 
                        WHERE A1.DateOf = A.DateOf -7)

This is more of a static solution that retrieves absences where the student's ID occurred twice in the past 7 days. It covers neither consecutive days nor three or more days.

If you need to get the absences in a time period (let's say the past 7 days), then you can do something like this:

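SELECT 
    ID,
    StudentID,
    [Date], 
    AbsenceReasonID
FROM (
    SELECT 
        ID,
        StudentID,
        [Date], 
        AbsenceReasonID, 
        -- total number of absence rows for this student (counted before the filters below)
        COUNT(StudentID) OVER (PARTITION BY StudentID) AS con, 
        -- weekday number that does not depend on DATEFIRST: 0 = Saturday, 1 = Sunday
        ((DATEPART(dw, [Date]) + @@DATEFIRST) % 7) AS dw
    FROM attendance
) D
WHERE 
    D.con > 2
    AND [Date] >= '2018-02-02'
    AND [Date] <= GETDATE()
    AND dw NOT IN (0, 1)  -- skip weekend dates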

Based on your given data, the output will be:

|     ID | StudentID |       Date | AbsenceReasonID |
|--------|-----------|------------|-----------------|
| 430957 |     10158 | 2018-02-02 |               2 |

You can adjust the output as you like.

SQL Fiddle
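The dw expression in the query above is worth a closer look, because a raw DATEPART(dw, ...) value changes with the session's DATEFIRST setting. The following is a minimal sketch, using a made-up table variable @d that is not part of the original answer, that can be run to check that adding @@DATEFIRST and taking the result modulo 7 gives the same numbers under either setting.

```sql
-- Hypothetical test dates: Friday 2018-02-02 through Monday 2018-02-05
DECLARE @d TABLE (d date);
INSERT INTO @d VALUES ('2018-02-02'), ('2018-02-03'), ('2018-02-04'), ('2018-02-05');

SET DATEFIRST 7;  -- Sunday is day 1 (the us_english default)
SELECT d, (DATEPART(dw, d) + @@DATEFIRST) % 7 AS dw FROM @d ORDER BY d;

SET DATEFIRST 1;  -- Monday is day 1
SELECT d, (DATEPART(dw, d) + @@DATEFIRST) % 7 AS dw FROM @d ORDER BY d;

-- Both result sets: Friday = 6, Saturday = 0, Sunday = 1, Monday = 2
```

This is why the filter dw NOT IN (0, 1) removes Saturday and Sunday regardless of how the server is configured.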

Try this:

The CTE contains the absence dates on which the student was also absent on both the day before and the day after (excluding weekends). The two UNIONs at the end add back the first and last row of each group and eliminate the duplicates.

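-- the case expressions below assume Monday = 1 ... Sunday = 7
-- note: this answer names the date column dateof; the question's sample table calls it Date
set datefirst 1;

with cte(id, studentId, dateof , absenceReasonId)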
as
(
select a.* 
from attendance a
where exists (select 1 from attendance preva
              where preva.studentID = a.studentID
              and   datediff(day, preva.dateof, a.dateof)
                    <= (case when datepart(dw, preva.dateof) >= 5
                        then 8 - datepart(dw, preva.dateof)
                        else 1 
                        end)
              and preva.dateof < a.dateof)
and exists (select 1 from attendance nexta
              where nexta.studentID = a.studentID
              and   datediff(day, a.dateof, nexta.dateof)
                    <= (case when datepart(dw, a.dateof) >= 5
                        then 8 - datepart(dw, a.dateof)
                        else 1 
                        end)
              and nexta.dateof > a.dateof))              

select cte.*
from cte
union  -- use union to remove duplicates
select preva.* 
from
attendance preva
inner join
cte
on preva.studentID = cte.studentID
and preva.dateof < cte.dateof
and datediff(day, preva.dateof, cte.dateof)
                    <= (case when datepart(dw, preva.dateof) >= 5
                        then 8 - datepart(dw, preva.dateof)
                        else 1 
                        end) 
union
select nexta.*
from attendance nexta
inner join
cte
on nexta.studentID = cte.studentID
and   datediff(day, cte.dateof, nexta.dateof)
       <= (case when datepart(dw, cte.dateof) >= 5
                then 8 - datepart(dw, cte.dateof)
                else 1 
            end)
and nexta.dateof > cte.dateof  
order by studentId, dateof 

sqlfiddle
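The case expression repeated throughout that query decides how many calendar days may separate two absences that still count as adjacent. As a rough illustration, assuming DATEFIRST is 1 (Monday = 1 ... Sunday = 7) and using made-up dates that are not from the answer, the allowed gap per weekday can be listed like this:

```sql
SET DATEFIRST 1;  -- assumption: Monday = 1 ... Sunday = 7

SELECT
    v.d,
    DATEPART(dw, v.d) AS dw,
    -- same rule as in the answer: from Friday onwards, allow the gap up to Monday
    CASE WHEN DATEPART(dw, v.d) >= 5
         THEN 8 - DATEPART(dw, v.d)
         ELSE 1
    END AS max_gap_in_days
FROM (VALUES (CONVERT(date, '2018-02-01')),   -- Thursday
             (CONVERT(date, '2018-02-02')),   -- Friday
             (CONVERT(date, '2018-02-03')),   -- Saturday
             (CONVERT(date, '2018-02-04'))    -- Sunday
     ) AS v(d)
ORDER BY v.d;

-- Thursday -> 1, Friday -> 3, Saturday -> 2, Sunday -> 1
```

Under the default DATEFIRST 7 the same expression would treat Thursday as day 5 and give it a 3-day gap, so the setting matters for this approach.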

You can use this to find your absence ranges. Here I use a recursive CTE to number all days over a few years while also recording each day's weekday. Another recursive CTE then chains together absence dates for the same student that fall one day after another, skipping weekends (read the CASE WHEN in the join clause). At the end, each absence spree is shown, filtered to those lasting at least N successive days.

SET DATEFIRST 1 -- Monday = 1, Sunday = 7

;WITH Days AS
(
    -- Recursive anchor: hard-coded first date
    SELECT
        GeneratedDate = CONVERT(DATE, '2017-01-01')

    UNION ALL

    -- Recursive expression: all days until day X
    SELECT
        GeneratedDate = DATEADD(DAY, 1, D.GeneratedDate)
    FROM
        Days AS D
    WHERE
        DATEADD(DAY, 1, D.GeneratedDate) <= '2020-01-01'
),
NumberedDays AS
(
    SELECT
        GeneratedDate = D.GeneratedDate,
        DayOfWeek = DATEPART(WEEKDAY, D.GeneratedDate),
        DayNumber = ROW_NUMBER() OVER (ORDER BY D.GeneratedDate ASC)
    FROM
        Days AS D
),
AttendancesWithNumberedDays AS
(
    SELECT
        A.*,
        N.*
    FROM
        Attendance AS A
        INNER JOIN NumberedDays AS N ON A.Date = N.GeneratedDate
),
AbsenceSpree AS
(
    -- Recursive anchor: absence day with no previous absence, skipping weekends
    SELECT
        StartingAbsenceDate = A.Date,
        CurrentDateNumber = A.DayNumber,
        CurrentDateDayOfWeek = A.DayOfWeek,
        AbsenceDays = 1,
        StudentID = A.StudentID
    FROM
        AttendancesWithNumberedDays AS A
    WHERE
        NOT EXISTS (
            SELECT
                'no previous absence date'
            FROM
                AttendancesWithNumberedDays AS X
            WHERE
                X.StudentID = A.StudentID AND
                X.DayNumber = CASE A.DayOfWeek 
                    WHEN 1 THEN A.DayNumber - 3 -- When monday then friday (-3)
                    WHEN 7 THEN A.DayNumber - 2 -- When sunday then friday (-2)
                    ELSE A.DayNumber - 1 END)

    UNION ALL

    -- Recursive expression: find the next absence day, skipping weekends
    SELECT
        StartingAbsenceDate = S.StartingAbsenceDate,
        CurrentDateNumber = A.DayNumber,
        CurrentDateDayOfWeek = A.DayOfWeek,
        AbsenceDays = S.AbsenceDays + 1,
        StudentID = A.StudentID
    FROM
        AbsenceSpree AS S
        INNER JOIN AttendancesWithNumberedDays AS A ON
            S.StudentID = A.StudentID AND
            A.DayNumber = CASE S.CurrentDateDayOfWeek
                WHEN 5 THEN S.CurrentDateNumber + 3 -- When friday then monday (+3)
                WHEN 6 THEN S.CurrentDateNumber + 2 -- When saturday then monday (+2)
                ELSE S.CurrentDateNumber + 1 END
)
SELECT
    StudentID = A.StudentID,
    StartingAbsenceDate = A.StartingAbsenceDate,
    EndingAbsenceDate = MAX(N.GeneratedDate),
    AbsenceDays = MAX(A.AbsenceDays)
FROM
    AbsenceSpree AS A
    INNER JOIN NumberedDays AS N ON A.CurrentDateNumber = N.DayNumber
GROUP BY
    A.StudentID,
    A.StartingAbsenceDate
HAVING
    MAX(A.AbsenceDays) >= 3
OPTION
    (MAXRECURSION 5000)

If you want to list the original Attendance table rows, you can replace the last SELECT:

SELECT
    StudentID = A.StudentID,
    StartingAbsenceDate = A.StartingAbsenceDate,
    EndingAbsenceDate = MAX(N.GeneratedDate),
    AbsenceDays = MAX(A.AbsenceDays)
FROM
    AbsenceSpree AS A
    INNER JOIN NumberedDays AS N ON A.CurrentDateNumber = N.DayNumber
GROUP BY
    A.StudentID,
    A.StartingAbsenceDate
HAVING
    MAX(A.AbsenceDays) >= 3

with this CTE + SELECT :

,
FilteredAbsenceSpree AS
(
    SELECT
        StudentID = A.StudentID,
        StartingAbsenceDate = A.StartingAbsenceDate,
        EndingAbsenceDate = MAX(N.GeneratedDate),
        AbsenceDays = MAX(A.AbsenceDays)
    FROM
        AbsenceSpree AS A
        INNER JOIN NumberedDays AS N ON A.CurrentDateNumber = N.DayNumber
    GROUP BY
        A.StudentID,
        A.StartingAbsenceDate
    HAVING
        MAX(A.AbsenceDays) >= 3
)
SELECT
    A.*
FROM
    Attendance AS A
    INNER JOIN FilteredAbsenceSpree AS F ON A.StudentID = F.StudentID
WHERE
    A.Date BETWEEN F.StartingAbsenceDate AND F.EndingAbsenceDate
OPTION
    (MAXRECURSION 5000)
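For comparison, the same kind of result can probably be reached without recursion by numbering school days arithmetically and using the usual "gaps and islands" trick. The following is only a sketch under several assumptions: the table variable @Attendance and its rows are made up for illustration, there is at most one row per student per date, and rows dated on a Saturday or Sunday are simply dropped. It relies on 1900-01-01 being a Monday, so it does not depend on DATEFIRST.

```sql
-- Hypothetical sample data (not the rows from the question)
DECLARE @Attendance TABLE (ID int, StudentID int, [Date] date, AbsenceReasonID int);
INSERT INTO @Attendance VALUES
    (1, 10158, '2018-02-01', 2),   -- Thursday
    (2, 10158, '2018-02-02', 2),   -- Friday
    (3, 10158, '2018-02-05', 11),  -- Monday: follows Friday once weekends are skipped
    (4, 12393, '2018-02-05', 9),   -- Monday
    (5, 12393, '2018-02-07', 9);   -- Wednesday: Tuesday is missing, so no run

WITH Offsets AS
(
    SELECT
        A.ID, A.StudentID, A.[Date], A.AbsenceReasonID,
        -- whole days since Monday 1900-01-01, so DaysSince % 7 is 0 = Monday ... 6 = Sunday
        DaysSince = DATEDIFF(DAY, '19000101', A.[Date])
    FROM @Attendance AS A
),
SchoolDays AS
(
    SELECT
        O.ID, O.StudentID, O.[Date], O.AbsenceReasonID,
        -- 5 school days per full week plus the position inside the current week
        SchoolDayNumber = (O.DaysSince / 7) * 5 + O.DaysSince % 7
    FROM Offsets AS O
    WHERE O.DaysSince % 7 < 5   -- keep Monday to Friday only
),
Islands AS
(
    SELECT
        S.ID, S.StudentID, S.[Date], S.AbsenceReasonID,
        -- consecutive school days share the same key
        IslandKey = S.SchoolDayNumber
                    - ROW_NUMBER() OVER (PARTITION BY S.StudentID ORDER BY S.[Date])
    FROM SchoolDays AS S
),
Counted AS
(
    SELECT
        I.ID, I.StudentID, I.[Date], I.AbsenceReasonID,
        RunLength = COUNT(*) OVER (PARTITION BY I.StudentID, I.IslandKey)
    FROM Islands AS I
)
SELECT ID, StudentID, [Date], AbsenceReasonID
FROM Counted
WHERE RunLength >= 3
ORDER BY StudentID, [Date];
```

On this made-up data the query should return the three rows for student 10158 and nothing for student 12393. Whether dropping weekend rows is acceptable depends on how the real Attendance table is populated.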
