How to “Roll-Up” data across multiple columns and rows

Question

I have an Audit table where we record changes to fields in our database. I have a query that pulls a subset of that audit data: a few columns, their recorded changes, and when each change was made, keyed by the applicable ID. Here is a sample of what the output looks like:

ID      ada       IsHD  HDF   DTStamp
-----------------------------------------------------
68      NULL      0     0     2020-04-28 21:12:21.287
68      NULL      NULL  NULL  2020-04-17 14:59:49.700
68      No/Unsure NULL  NULL  2020-04-17 14:03:46.160
68      NULL      0     0     2020-04-17 13:49:49.720
102     NULL      NULL  NULL  2020-04-30 13:11:15.273
102     No/Unsure NULL  NULL  2020-04-20 16:00:35.410
102     NULL      1     1     2020-04-20 15:59:55.750
105     No/Unsure 1     1     2020-04-17 12:06:10.833
105     NULL      NULL  NULL  2020-04-13 07:51:30.180
126     NULL      NULL  NULL  2020-05-01 17:59:24.460
126     NULL      0     0     2020-04-28 21:12:21.287

What I am trying to figure out is the most efficient way to "roll up" the multiple rows of a given ID so that the newest non-NULL value in each column is kept, leaving only a single row for that ID.

That is, turn this:

68      NULL      0     0     2020-04-28 21:12:21.287
68      NULL      NULL  NULL  2020-04-17 14:59:49.700
68      No/Unsure NULL  NULL  2020-04-17 14:03:46.160
68      NULL      0     0     2020-04-17 13:49:49.720
102     NULL      NULL  NULL  2020-04-30 13:11:15.273
102     No/Unsure NULL  NULL  2020-04-20 16:00:35.410
102     NULL      1     1     2020-04-20 15:59:55.750

Into this:

68      No/Unsure 0     0     2020-04-28 21:12:21.287
102     No/Unsure 1     1     2020-04-30 13:11:15.273

...and so on down the list. It's almost like you were to push down on the top of the results and squeeze out all the NULLs, as it were.
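A minimal setup sketch for reproducing this. The column types are assumptions (the real audit table's types are not shown here), and the rows are copied from the sample output above. Because @audit is a table variable, any query against it has to run in the same batch as this declaration.

```sql
-- Assumed column types; adjust to match the real audit table.
DECLARE @audit TABLE (
    [ID]      int         NOT NULL,
    [ADA]     varchar(20) NULL,
    [IsHD]    int         NULL,
    [HDF]     int         NULL,
    [DTStamp] datetime    NOT NULL
);

-- ISO 8601 literals (with the T separator) parse the same way
-- regardless of the session's language or DATEFORMAT setting.
INSERT INTO @audit ([ID], [ADA], [IsHD], [HDF], [DTStamp])
VALUES
    (68,  NULL,        0,    0,    '2020-04-28T21:12:21.287'),
    (68,  NULL,        NULL, NULL, '2020-04-17T14:59:49.700'),
    (68,  'No/Unsure', NULL, NULL, '2020-04-17T14:03:46.160'),
    (68,  NULL,        0,    0,    '2020-04-17T13:49:49.720'),
    (102, NULL,        NULL, NULL, '2020-04-30T13:11:15.273'),
    (102, 'No/Unsure', NULL, NULL, '2020-04-20T16:00:35.410'),
    (102, NULL,        1,    1,    '2020-04-20T15:59:55.750'),
    (105, 'No/Unsure', 1,    1,    '2020-04-17T12:06:10.833'),
    (105, NULL,        NULL, NULL, '2020-04-13T07:51:30.180'),
    (126, NULL,        NULL, NULL, '2020-05-01T17:59:24.460'),
    (126, NULL,        0,    0,    '2020-04-28T21:12:21.287');
```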

After dumping the above results into a table variable, @audit, I run the following query:

SELECT DISTINCT a.[ID]
     , (SELECT TOP 1 [ADA]
        FROM @audit
        WHERE [ID] = a.[ID]
          AND [ADA] IS NOT NULL
        ORDER BY [DTStamp] DESC) AS 'ADA'
     , (SELECT TOP 1 [IsHD]
        FROM @audit
        WHERE [ID] = a.[ID]
          AND [IsHD] IS NOT NULL
        ORDER BY [DTStamp] DESC) AS 'IsHD'
     , (SELECT TOP 1 [HDF]
        FROM @audit
        WHERE [ID] = a.[ID]
          AND [HDF] IS NOT NULL
        ORDER BY [DTStamp] DESC) AS 'HDF'
     , (SELECT Max([DTStamp])
        FROM @audit
        WHERE [ID] = a.[ID]) AS 'DTStamp'
FROM @audit a
ORDER BY [ID]

This is what I've come up with, and it does work, but it feels very clunky and inefficient. Is there a better way to accomplish the end goal?

Answer 1

If you want one row per id, then use aggregation:

select id, max(ada) as ada, max(IsHD) as IsHD, max(HDF) as HDF, max(DTStamp) as DTStamp
from @audit a
group by id;

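This works for the data you have provided, where each column holds at most one distinct non-NULL value per id, and seems to fit the rule that you want. Keep in mind that max() returns the largest value, not the most recent one.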
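To see where plain aggregation and the "newest non-NULL" rule can diverge, here is a small sketch with made-up rows. The @demo table, its int column type, and its values are hypothetical and not taken from the question.

```sql
-- Hypothetical data: the value changes from 1 to 0 over time.
DECLARE @demo TABLE ([ID] int, [IsHD] int, [DTStamp] datetime);

INSERT INTO @demo ([ID], [IsHD], [DTStamp])
VALUES
    (1, 1,    '2020-04-17T13:00:00'),  -- older value
    (1, 0,    '2020-04-28T21:00:00'),  -- newest non-NULL value
    (1, NULL, '2020-04-30T09:00:00');  -- newest row, but NULL

-- MAX() ignores NULLs and returns the largest value: 1
SELECT [ID], MAX([IsHD]) AS IsHD_max
FROM @demo
GROUP BY [ID];

-- The newest non-NULL value by DTStamp is 0
SELECT TOP (1) [IsHD] AS IsHD_latest
FROM @demo
WHERE [ID] = 1
  AND [IsHD] IS NOT NULL
ORDER BY [DTStamp] DESC;
```

If a value can change more than once per id, the two queries disagree, so the aggregation shortcut is only safe when each column has a single distinct non-NULL value per id.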

Answer 2

I understand that you want the "latest" non-null value per id for each column, using column DTStamp for ordering.

Your approach using multiple subqueries does what you want. An alternative would be to use multiple row_number()s and conditional aggregation. This might actually be more efficient, since it avoids multiple scans on the table.

select
    id,
    max(case when rn_ada  = 1 then ada  end) ada,
    max(case when rn_isHd = 1 then isHd end) isHd,
    max(case when rn_hdf  = 1 then hdf  end) hdf,
    max(DTStamp) DTStamp
from (
    select 
        a.*,
        row_number() over(
            partition by id
            order by case when ada is not null then DTStamp end desc
        ) rn_ada,
        row_number() over(
            partition by id
            order by case when isHd is not null then DTStamp end desc
        ) rn_isHd,
        row_number() over(
            partition by id
            order by case when hdf is not null then DTStamp end desc
        ) rn_hdf
    from @audit a
) t
group by id
order by id

Demo on DB Fiddle:

 id | ada       | isHd | hdf | DTStamp                
--: | :-------- | ---: | --: | :----------------------
 68 | No/Unsure |    0 |   0 | 2020-04-28 21:12:21.287
102 | No/Unsure |    1 |   1 | 2020-04-30 13:11:15.273
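The ordering trick in the query above depends on SQL Server treating NULL as the lowest value when sorting, so with a descending order the rows whose sort key is NULL come last. A minimal sketch of that behaviour, using three of the sample rows for id 68 as an inline VALUES list (the alias v is just for illustration):

```sql
SELECT
    v.ada,
    v.DTStamp,
    ROW_NUMBER() OVER (
        -- The sort key is NULL whenever ada is NULL, so those rows sort last under DESC.
        ORDER BY CASE WHEN v.ada IS NOT NULL THEN v.DTStamp END DESC
    ) AS rn_ada
FROM (VALUES
        (CAST(NULL AS varchar(20)), CAST('2020-04-28T21:12:21.287' AS datetime)),
        ('No/Unsure',               CAST('2020-04-17T14:03:46.160' AS datetime)),
        (NULL,                      CAST('2020-04-17T13:49:49.720' AS datetime))
     ) AS v (ada, DTStamp)
ORDER BY rn_ada;
```

The 'No/Unsure' row gets rn_ada = 1 even though it is not the newest row. The two NULL rows share the same (NULL) sort key, so which of them gets 2 and which gets 3 is not guaranteed. If every row for an id has a NULL in that column, the row numbered 1 is itself NULL, and the outer max(case ...) simply returns NULL.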
