
CASE with multiple condition - Teradata/SQL

My dataset looks like this in Teradata:

╔═══════════╦══════════╦══════╗
║ studentid ║   date   ║ days ║
╠═══════════╬══════════╬══════╣
║      1000 ║ 2/1/2017 ║   25 ║
║      1000 ║ 3/8/2017 ║   30 ║
║      1000 ║ 4/4/2017 ║   80 ║
║      1000 ║ 5/1/2017 ║   81 ║
║      1001 ║ 1/1/2017 ║   60 ║
║      1001 ║ 2/1/2017 ║   20 ║
║      1001 ║ 4/1/2017 ║   81 ║
╚═══════════╩══════════╩══════╝

I would like to add a new column (flag) that is 1 on the rows for a student's two most recent dates if both of those rows have days equal to 80 or 81, and 0 otherwise.

For student 1001, the flag should be 0 on all rows. Although the most recent date has 81, the second most recent date has 20, so the two most recent dates do not both have 80 or 81.

Desired output:

╔═══════════╦══════════╦══════╦══════╗
║ studentid ║   date   ║ days ║ flag ║
╠═══════════╬══════════╬══════╬══════╣
║      1000 ║ 2/1/2017 ║   25 ║    0 ║
║      1000 ║ 3/8/2017 ║   30 ║    0 ║
║      1000 ║ 4/4/2017 ║   80 ║    1 ║
║      1000 ║ 5/1/2017 ║   81 ║    1 ║
║      1001 ║ 1/1/2017 ║   60 ║    0 ║
║      1001 ║ 2/1/2017 ║   20 ║    0 ║
║      1001 ║ 4/1/2017 ║   81 ║    0 ║
╚═══════════╩══════════╩══════╩══════╝

Assign row numbers with row_number, ordered by date descending (the date column is called dt below), and then get the min and max days of the latest 2 rows per studentid. Then check both values with a case expression to assign the flag.

select studentid,dt,days
,case when rnum in (1,2) and max_days_latest_2 in (80,81) and min_days_latest_2 in (80,81) then 1 else 0 end as flag
from (select t.*
      ,max(case when rnum in (1,2) then days end) over(partition by studentid) as max_days_latest_2
      ,min(case when rnum in (1,2) then days end) over(partition by studentid) as min_days_latest_2
      from (select t.*,row_number() over(partition by studentid order by dt desc) as rnum
            from tbl t
           ) t
     ) t

For the first two rows you can apply simpler logic, which results in a single STAT step in Explain:

If the current row is the 1st row: check if this and the following row both contain one of those values

If the current row is the 2nd row: check if this and the previous row both contain one of those values

SELECT studentid, date_, Days,
   CASE Row_Number()
        Over (PARTITION BY studentid
              ORDER BY date_ DESC)
      WHEN 1 
         THEN CASE WHEN Days IN (80,81)
--                    AND Min(Days) Over (PARTITION BY studentid ORDER BY date_ DESC ROWS BETWEEN 1 Following AND 1 Following) IN (80,81)
                    AND Lead(Days) Over (PARTITION BY studentid ORDER BY date_ DESC) IN (80,81)
                   THEN 1
                   ELSE 0
              END
      WHEN 2
         THEN CASE WHEN Days IN (80,81) 
--                    AND Min(Days) Over (PARTITION BY studentid ORDER BY date_ DESC ROWS BETWEEN 1 Preceding AND 1 Preceding) IN (80,81)
                    AND Lag(Days) Over (PARTITION BY studentid ORDER BY date_ DESC) IN (80,81)
                   THEN 1
                   ELSE 0
              END
      ELSE 0
   END AS flag
FROM tab

If your Teradata release doesn't support LEAD / LAG, use the commented-out MIN ... ROWS BETWEEN syntax instead.
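As a rough check that the LEAD form and the MIN form with a one-row frame return the same value, the sketch below runs both against the question's sample rows. It uses Python's built-in sqlite3 module instead of Teradata, which is an assumption made only so the example can run anywhere (SQLite 3.25 or later is needed for window functions). The table name tab, the column name date_ and the function name compare_lead_and_min are illustrative.

```python
import sqlite3

# Sample rows from the question; dates are stored as ISO strings so that
# text ordering matches chronological ordering.
ROWS = [
    (1000, "2017-02-01", 25),
    (1000, "2017-03-08", 30),
    (1000, "2017-04-04", 80),
    (1000, "2017-05-01", 81),
    (1001, "2017-01-01", 60),
    (1001, "2017-02-01", 20),
    (1001, "2017-04-01", 81),
]

# For every row, fetch the "next" row's days (next in descending date order)
# in two ways: with LEAD, and with MIN over a frame holding only that row.
QUERY = """
SELECT studentid, date_, days,
       LEAD(days) OVER (PARTITION BY studentid ORDER BY date_ DESC) AS via_lead,
       MIN(days)  OVER (PARTITION BY studentid ORDER BY date_ DESC
                        ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING) AS via_min
FROM tab
ORDER BY studentid, date_
"""


def compare_lead_and_min():
    """Return (studentid, date_, days, via_lead, via_min) for each sample row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tab (studentid INTEGER, date_ TEXT, days INTEGER)")
    conn.executemany("INSERT INTO tab VALUES (?, ?, ?)", ROWS)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in compare_lead_and_min():
        print(row)
```

On the sample data the last two columns agree on every row, including the oldest row of each student, where both are NULL because no following row exists.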

But if you need to apply this logic to more than two rows, you need a more generic approach:

SELECT studentid, date_, Days,
   -- check if the first n rows contain only searched values
   CASE WHEN x IS NOT NULL THEN Min(x) Over (PARTITION BY studentid) ELSE 0 END AS flag
FROM
 (
   SELECT studentid, date_, Days,
      CASE
         WHEN Row_Number()
              Over (PARTITION BY studentid
                    ORDER BY date_ DESC) BETWEEN 1 AND 2   -- only for the first n days
         THEN CASE WHEN Days IN (80,81) THEN 1 ELSE 0 END  -- flag the searched values
      END AS x
   FROM tab AS t
 ) AS dt
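To see how the generic query behaves for different values of n, here is a minimal sketch that parameterizes the row count and the searched values and runs the query on the question's sample rows. It again assumes Python's sqlite3 module (SQLite 3.25 or later) as a stand-in for Teradata. The function name flag_latest and its parameters are hypothetical, introduced only for illustration.

```python
import sqlite3

# Sample rows from the question, with ISO date strings for correct ordering.
ROWS = [
    (1000, "2017-02-01", 25),
    (1000, "2017-03-08", 30),
    (1000, "2017-04-04", 80),
    (1000, "2017-05-01", 81),
    (1001, "2017-01-01", 60),
    (1001, "2017-02-01", 20),
    (1001, "2017-04-01", 81),
]


def flag_latest(n, values):
    """Flag the latest n rows per student when all n have days in `values`.

    Returns (studentid, date_, days, flag) tuples ordered by student and date.
    """
    if not values:
        raise ValueError("values must not be empty")
    placeholders = ",".join("?" for _ in values)
    query = f"""
    SELECT studentid, date_, days,
           -- x is NULL outside the latest n rows, so those rows get 0;
           -- MIN ignores NULLs and is 1 only if every candidate row matched
           CASE WHEN x IS NOT NULL
                THEN MIN(x) OVER (PARTITION BY studentid)
                ELSE 0
           END AS flag
    FROM (
        SELECT studentid, date_, days,
               CASE WHEN ROW_NUMBER() OVER (PARTITION BY studentid
                                            ORDER BY date_ DESC) <= ?
                    THEN CASE WHEN days IN ({placeholders}) THEN 1 ELSE 0 END
               END AS x
        FROM tab
    ) AS dt
    ORDER BY studentid, date_
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tab (studentid INTEGER, date_ TEXT, days INTEGER)")
    conn.executemany("INSERT INTO tab VALUES (?, ?, ?)", ROWS)
    # Bind n first, then the searched values, matching the order of the ? marks.
    result = conn.execute(query, [n, *values]).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in flag_latest(2, (80, 81)):
        print(row)
```

With n = 2 and the values 80 and 81, the flags match the desired output in the question. With n = 3, student 1000 is no longer flagged, because the third most recent row has 30.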
