简体   繁体   中英

How to identify changed values using a SQL Server temporal table?

I have a SQL Azure table and I have turned on the new Temporal Table feature (New to SQL Server 2016 and SQL Azure v12). This feature creates another table to track all changes to the primary table (I included a link to the docs about temporal tables at the bottom of my question). You can use the special query language to get this history. Note the FOR SYSTEM_TIME ALL in the following query:

SELECT 
    ValidFrom
    , ValidTo
    , ShiftId
    , TradeDate
    , StatusID
    , [LastActionDate]
    , [OwnerUserID]
    , [WorkerUserID]
    , [WorkerEmail]
    , [Archived]
FROM [KrisisShifts_ShiftTrade] 
FOR SYSTEM_TIME ALL
WHERE [ShiftID] = 27
ORDER BY ValidTo Desc

The result set looks like this:

ValidFrom                   ValidTo                     ShiftId     TradeDate  StatusID    LastActionDate          OwnerUserID WorkerUserID WorkerEmail                                        Archived
--------------------------- --------------------------- ----------- ---------- ----------- ----------------------- ----------- ------------ -------------------------------------------------- --------
2017-06-21 00:26:44.51      9999-12-31 23:59:59.99      27          2017-01-27 3           2017-01-09 16:23:39.760 45          34           test@hotmail.com                                   1
2017-06-21 00:19:35.57      2017-06-21 00:26:44.51      27          2017-01-27 2           2017-01-09 16:23:39.760 45          34           test@hotmail.com                                   1
2017-06-21 00:19:16.25      2017-06-21 00:19:35.57      27          2017-01-28 3           2017-01-09 16:23:39.760 45          34           test@hotmail.com                                   1

Using the SYSTEM_TIME FOR ALL The temporal Table returns the current record from the primary table, which is the first one, and the remaining records are previous versions of that record stored in the tracking table. (you can see the validFrom and ValidTo columns, obviously the time the record was the current record) In this case, the tracking table that keeps the historical records is called KrisisShifts_ShiftTrade_History

WHAT I WANT:

I want to build a query that just highlights the changes made at each historical point. Notice that the second record has a different StatusID and that the Third record has a different TradeDate

I want to product a result set like below (I imagne I will ignore the first or current record because it obviously is not chnaged):

DESIRED RESULT :

ShiftId      Column          Value             ValidFrom                   ValidTo
----------  -------------  ------------------- --------------------------- --------------------------
27          StatusId       2                   2017-06-21 00:19:35.57      2017-06-21 00:26:44.51
27          TradeDate      2017-01-28          2017-06-21 00:19:35.57      2017-06-21 00:26:44.51   

I am not sure how to accomplish this. Or I am open to another solution. I want to be able to quickly view the changes for each record compared to the original record.

I tried to unpivot the results to compare them, but I was unable to get that to work because the shift ID is the same for every row. I would love to show more work here, but I am really stuck.

EDIT 1:

I have been able to isolate the changes for just one column in the following query using lag(). I could union this query with a similar one for each column I want to track, however, this is a lot of work and has to be built for each table. Is there a way to do this dynamically so it detects the columns automatically?

StatusID change history query:(I isolate the records to a shiftId of 27 just for testing)

SELECT 'SHIFT STATUS'  as ColumnName, t1.RecVersion, t1.ShiftID, t1.ValidFrom, t1.ValidTo, t1.StatusId
, (SELECT [Title] FROM [dbo].[KrisisShifts_Status] WHERE [dbo].[KrisisShifts_Status].[StatusID] = t1.StatusId) AS RecStatus
FROM
    (SELECT TOP 100 PERCENT 
        ROW_NUMBER() OVER(PARTITION BY ShiftId ORDER BY ValidTo ASC) AS RecVersion -- reverse sorting the ValidTo date gives "version count" to column changes
        , t2.ValidTo
        , t2.ValidFrom
        , t2.ShiftID
        , t2.StatusId
        , LAG(StatusId,1,0) OVER (ORDER BY ValidTo DESC) AS PrevStatusId
    FROM [KrisisShifts_ShiftTrade] 
    FOR SYSTEM_TIME ALL AS t2

    ORDER BY t2.ValidTo Desc
    ) AS t1
WHERE
    (t1.StatusId <> t1.PrevStatusId)
    AND
    SHIFTID = 27
ORDER BY t1.ValidTo DESC

RESULTS of query:

ColumnName   RecVersion           ShiftID     ValidFrom                   ValidTo                     StatusId    RecStatus
------------ -------------------- ----------- --------------------------- --------------------------- ----------- --------------------------------------------------
SHIFT STATUS 3                    27          2017-06-21 00:26:44.51      2017-06-25 14:09:32.37      3           Confirmed
SHIFT STATUS 2                    27          2017-06-21 00:19:35.57      2017-06-21 00:26:44.51      2           Reserved
SHIFT STATUS 1                    27          2017-06-21 00:19:16.25      2017-06-21 00:19:35.57      3           Confirmed

END EDIT 1:

QUESTION:

Can someone help me isolate just the changed data in columns from the previous record for each shiftId in the temporal table result set?

Thanks in advance

EDIT # 2:

The following is a list of all the columns I want to "watch for changes" from this table:

[TradeDate] [StatusID] [LastActionDate] [AllowedRankID] [OwnerUserID] [OwnerEmail] [OwnerLocationID] [OwnerRankID] [OwnerEmployeeID] [WorkerUserID] [WorkerEmail] [WorkerLocationID] [WorkerRankID] [WorkerPlatoonID] [WorkerEmployeeID] [IsPartialShift] [Detail] [LastModifiedByUserID] [Archived] [UpdatedDate]

END EDIT 2:

NOTE ON NEW TAG:

I created a new tag for temporal tables as there is not one. The following has the description of them if someone with more reputation wants to add it tot he tag's details.

MS Docs on Temporal Tables

You can also use CROSS APPLY to UNPIVOT .

It should be noted that the ValidFrom and ValidTo refer to the validity of the row version itself not neccessarily the column value. I believe this is what you are requesting but this may be confusing.

Demo

WITH T
     AS (SELECT ValidFrom,
                ValidTo,
                ShiftId,
                TradeDate,
                StatusID,
                LastActionDate,
                OwnerUserID,
                WorkerUserID,
                WorkerEmail,
                Archived,
                nextTradeDate = LEAD(TradeDate) OVER (PARTITION BY ShiftId ORDER BY ValidFrom),
                nextStatusID = LEAD(StatusID) OVER (PARTITION BY ShiftId ORDER BY ValidFrom),
                nextLastActionDate = LEAD(LastActionDate) OVER (PARTITION BY ShiftId ORDER BY ValidFrom),
                nextOwnerUserID = LEAD(OwnerUserID) OVER (PARTITION BY ShiftId ORDER BY ValidFrom),
                nextWorkerUserID = LEAD(WorkerUserID) OVER (PARTITION BY ShiftId ORDER BY ValidFrom),
                nextWorkerEmail = LEAD(WorkerEmail) OVER (PARTITION BY ShiftId ORDER BY ValidFrom),
                nextArchived = LEAD(Archived) OVER (PARTITION BY ShiftId ORDER BY ValidFrom)
         FROM   KrisisShifts_ShiftTrade)
SELECT ShiftId,
       Colname AS [Column],
       value,
       ValidFrom,
       ValidTo
FROM   T
       CROSS APPLY ( VALUES 
                    ('TradeDate', CAST(TradeDate AS NVARCHAR(4000)), CAST(nextTradeDate AS NVARCHAR(4000))),
                    ('StatusID', CAST(StatusID AS NVARCHAR(4000)), CAST(nextStatusID AS NVARCHAR(4000))),
                    ('LastActionDate', CAST(LastActionDate AS NVARCHAR(4000)), CAST(nextLastActionDate AS NVARCHAR(4000))),
                    ('OwnerUserID', CAST(OwnerUserID AS NVARCHAR(4000)), CAST(nextOwnerUserID AS NVARCHAR(4000))),
                    ('WorkerUserID', CAST(WorkerUserID AS NVARCHAR(4000)), CAST(nextWorkerUserID AS NVARCHAR(4000))),
                    ('WorkerEmail', CAST(WorkerEmail AS NVARCHAR(4000)), CAST(nextWorkerEmail AS NVARCHAR(4000))),
                    ('Archived', CAST(Archived AS NVARCHAR(4000)), CAST(nextArchived AS NVARCHAR(4000)))
                   ) CA(Colname, value, nextvalue)
WHERE  EXISTS(SELECT value
              EXCEPT
              SELECT nextvalue)
       AND ValidTo <> '9999-12-31 23:59:59'
ORDER  BY ShiftId,
          [Column],
          ValidFrom;

If you did want the validity at column level you could use ( Demo )

WITH T1 AS
(
SELECT *, 
       ROW_NUMBER() OVER (PARTITION BY ShiftId, colname ORDER BY ValidFrom)  
       - ROW_NUMBER() OVER (PARTITION BY ShiftId, colname, Colvalue ORDER BY ValidFrom)  AS Grp,
       IIF(DENSE_RANK() OVER (PARTITION BY ShiftId, colname ORDER BY Colvalue) + 
        DENSE_RANK() OVER (PARTITION BY ShiftId, colname ORDER BY Colvalue DESC) = 2, 0,1) AS HasChanges
FROM KrisisShifts_ShiftTrade
       CROSS APPLY ( VALUES 
                    ('TradeDate', CAST(TradeDate AS NVARCHAR(4000))),
                    ('StatusID', CAST(StatusID AS NVARCHAR(4000))),
                    ('LastActionDate', CAST(LastActionDate AS NVARCHAR(4000))),
                    ('OwnerUserID', CAST(OwnerUserID AS NVARCHAR(4000))),
                    ('WorkerUserID', CAST(WorkerUserID AS NVARCHAR(4000))),
                    ('WorkerEmail', CAST(WorkerEmail AS NVARCHAR(4000))),
                    ('Archived', CAST(Archived AS NVARCHAR(4000)))
                   ) CA(Colname, Colvalue)
)
SELECT  ShiftId, colname, Colvalue, MIN(ValidFrom) AS ValidFrom, MAX(ValidTo) AS ValidTo
FROM T1
WHERE HasChanges = 1
GROUP BY ShiftId, colname, Colvalue, Grp
ORDER  BY ShiftId,
          colname,
          ValidFrom;

This will certainly not be the best performing way but meets the requirement

Is there a way to do this dynamically so it detects the columns automatically?

Demo

WITH k
     AS (SELECT *,
                ROW_NUMBER() OVER (PARTITION BY ShiftId ORDER BY ValidFrom) AS _RN
         FROM   KrisisShifts_ShiftTrade
        /*FOR SYSTEM_TIME ALL*/
        ),
     T
     AS (SELECT k.*,
                _colname = n.n.value('local-name(.)[1]', 'sysname'),
                _colvalue = n.n.value('text()[1]', 'nvarchar(4000)')
         FROM   k
                CROSS apply (SELECT (SELECT k.*
                                     FOR xml path('row'), elements xsinil, type)) ca(x)
                CROSS APPLY x.nodes('/row/*[not(self::_RN or self::ValidFrom or self::ValidTo)]') n(n))
SELECT T.ShiftId,
       T._colname  AS [Column],
       T._colvalue AS value,
       t.ValidFrom,
       T.ValidTo
FROM   T T
       INNER JOIN T Tnext
         ON Tnext._RN = T._RN + 1
            AND T.ShiftId = Tnext.ShiftId
            AND T._colname = Tnext._colname
WHERE  EXISTS(SELECT T._colvalue
              EXCEPT
              SELECT Tnext._colvalue)
ORDER  BY ShiftId,
          [Column],
          ValidFrom;

Method

Would suggest using a stored procedure that loops through the rows using a cursor and builds up the results in a temporary table. (Since there are a manageable number of columns here I'd suggest doing the comparisons of each column value manually rather than attempting to do it dynamically as the latter would be more complex.)

Demo

Rextester demo: http://rextester.com/EEELN72555

Stored procedure SQL

CREATE PROCEDURE GetChanges(@RequestedShiftID INT)
AS
BEGIN
    DECLARE @ValidFrom DATETIME, @ValidTo DATETIME, @TradeDate DATETIME; 
    DECLARE @PrevTradeDate DATETIME, @LastActionDate DATETIME;
    DECLARE @PrevLastActionDate DATETIME;
    DECLARE @ShiftId INT, @StatusID INT, @PrevStatusID INT, @OwnerUserID INT;
    DECLARE @PrevOwnerUserID INT, @WorkerUserID INT, @PrevWorkerUserID INT;
    DECLARE @Archived INT, @PrevArchived INT;
    DECLARE @WorkerEmail VARCHAR(MAX), @PrevWorkerEmail VARCHAR(MAX);

    CREATE TABLE #Results (Id INT NOT NULL IDENTITY (1,1) PRIMARY KEY, ShiftId INT,
                           [Column] VARCHAR(255), Value VARCHAR(MAX), 
                           ValidFrom DATETIME, ValidTo DATETIME);

    DECLARE cur CURSOR FOR
    SELECT 
        ValidFrom
        , ValidTo
        , ShiftId
        , TradeDate
        , StatusID
        , [LastActionDate]
        , [OwnerUserID]
        , [WorkerUserID]
        , [WorkerEmail]
        , [Archived]
    FROM [KrisisShifts_ShiftTrade]
    FOR SYSTEM_TIME ALL
    WHERE [ShiftID] = @RequestedShiftID
    ORDER BY ValidTo Desc;

    OPEN cur;
    FETCH NEXT FROM cur INTO
        @ValidFrom
        , @ValidTo
        , @ShiftId
        , @TradeDate
        , @StatusID
        , @LastActionDate
        , @OwnerUserID
        , @WorkerUserID
        , @WorkerEmail
        , @Archived;

    WHILE @@FETCH_STATUS = 0
    BEGIN
       SET @PrevTradeDate = @TradeDate;
       SET @PrevStatusID = @StatusID;
       SET @PrevLastActionDate = @LastActionDate;
       SET @PrevOwnerUserID = @OwnerUserID;
       SET @PrevWorkerUserID = @WorkerUserID;
       SET @PrevWorkerEmail = @WorkerEmail;
       SET @PrevArchived = @Archived;

       FETCH NEXT FROM cur INTO
            @ValidFrom
            , @ValidTo
            , @ShiftId
            , @TradeDate
            , @StatusID
            , @LastActionDate
            , @OwnerUserID
            , @WorkerUserID
            , @WorkerEmail
            , @Archived;

       IF @TradeDate <> @PrevTradeDate
           INSERT INTO #Results (ShiftId, [Column], Value, ValidFrom, ValidTo)
           VALUES (@ShiftId, 'TradeDate', @TradeDate, @ValidFrom, @ValidTo);
       IF @StatusID <> @PrevStatusID
           INSERT INTO #Results (ShiftId, [Column], Value, ValidFrom, ValidTo)
           VALUES (@ShiftId, 'StatusID', @StatusID, @ValidFrom, @ValidTo);
       IF @LastActionDate <> @PrevLastActionDate
           INSERT INTO #Results (ShiftId, [Column], Value, ValidFrom, ValidTo)
           VALUES (@ShiftId, 'LastActionDate', @LastActionDate, @ValidFrom, @ValidTo);
       IF @OwnerUserID <> @PrevOwnerUserID
           INSERT INTO #Results (ShiftId, [Column], Value, ValidFrom, ValidTo)
           VALUES (@ShiftId, 'OwnerUserID', @OwnerUserID, @ValidFrom, @ValidTo);
       IF @WorkerUserID <> @PrevWorkerUserID
           INSERT INTO #Results (ShiftId, [Column], Value, ValidFrom, ValidTo)
           VALUES (@ShiftId, 'WorkerUserID', @WorkerUserID, @ValidFrom, @ValidTo);
       IF @WorkerEmail <> @PrevWorkerEmail
           INSERT INTO #Results (ShiftId, [Column], Value, ValidFrom, ValidTo)
           VALUES (@ShiftId, 'WorkerEmail', @WorkerEmail, @ValidFrom, @ValidTo);
       IF @Archived <> @PrevArchived
           INSERT INTO #Results (ShiftId, [Column], Value, ValidFrom, ValidTo)
           VALUES (@ShiftId, 'WorkerEmail', @WorkerEmail, @ValidFrom, @ValidTo);
    END   

    CLOSE cur;
    DEALLOCATE cur;

    SELECT ShiftId, [Column], Value, ValidFrom, ValidTo
    FROM #Results
    ORDER BY Id
END;

Note: The above only includes the columns that were in the example in the question. The list of columns that could change in the recent edit was wider than this but the others could of course be added in just the same way.

-- Very interesting question.

-- Think about your desired result - column "Value" should contain values of different types (int, decimal, date, binary, varchar, ...). So you need convert values to varchar, or use sqlvariant, or binary. Then on some point you will need to recognize type of value and process it differently for the different rows

-- To get values you may try to use UNPIVOT:

SELECT someRowID, ValidTo, ValidFrom, col, val
FROM 
    (SELECT someRowID, ValidTo, ValidFrom /*, ... */,
            [TradeDate], [StatusID], [LastActionDate], [AllowedRankID], [OwnerUserID], [OwnerEmail], [OwnerLocationID], [OwnerRankID], [OwnerEmployeeID], [WorkerUserID], [WorkerEmail], [WorkerLocationID], [WorkerRankID], [WorkerPlatoonID], [WorkerEmployeeID], [IsPartialShift], [Detail], [LastModifiedByUserID], [Archived], [UpdatedDate]
     FROM ... ) AS p
UNPIVOT    
    (val FOR col IN ([TradeDate], [StatusID], [LastActionDate], [AllowedRankID], [OwnerUserID], [OwnerEmail], [OwnerLocationID], [OwnerRankID], [OwnerEmployeeID], [WorkerUserID], [WorkerEmail], [WorkerLocationID], [WorkerRankID], [WorkerPlatoonID], [WorkerEmployeeID], [IsPartialShift], [Detail], [LastModifiedByUserID], [Archived], [UpdatedDate])
) AS unpvt

Then similary UNPIVOT previous values

... and join results as

SELECT ...
FROM prevVals
INNER JOIN vals 
        ON vals.someRowID = prevVals.someRowID 
       AND vals.col = prevVals.col
WHERE vals.val <> prevVals.val        -- yes, I know here can be a problem (NULLs, types)

This is just an idea and I hope it will help

Try not to use temporal Table feature:). Try trigger to check changes - it's much more easier and much shorter.

Create image of your table with timestamp and dml type column (row_id, s__dml_dt, s__dml_type + all your columns from source table) for all dml types (i,u,d).

create trigger dbo.KrisisShifts_ShiftTrade on dbo.KrisisShifts_ShiftTrade
after insert as
begin
     insert into dbo.KrisisShifts_ShiftTrade_logtable
     select getdate() s__dml_dt, 'i' s__dml_type, * from inserted
     -- for udpate select getdate() s__dml_dt, 'i' s__dml_type, * from inserted
     -- for delete select getdate() s__dml_dt, 'd' s__dml_type, * from deleted
end

Now after insert/delete/update you can check all your historical values. If you would like pivoted result you can easily create view with pivot for dbo.KrisisShifts_ShiftTrade_logtable.

Script to log all tables in database (it'll create tables with prefix r_).

declare @table sysname
declare @nl varchar(2)
declare @create_table int
declare @cmd varchar(max)
declare @trgname sysname
declare c_tables cursor for
    select table_name,
             case
                when exists (
                  select 2
                     from information_schema.tables
                    where table_name = 'r_'+ot.table_name
                  ) then 0
                else 1
             end create_table
      from information_schema.tables ot 
     where table_type = 'BASE TABLE'
        and table_name not like 'r[_]%'
        --and table_name like @tblfilter

open c_tables
fetch next from c_tables into @table,@create_table
while @@fetch_status=0
begin
   -- logovaci tabulka
    if @create_table=1
    begin
        set @cmd = 'create table r_'+@table+'(s__row_id int not null identity(1,1),s__dml_dt datetime not null,s__dml_type char(1) not null'
        select @cmd = @cmd + char(13)+char(10)+','+column_name+' '+data_type+isnull('('+case when character_maximum_length<0 then 'max' else cast(character_maximum_length as varchar) end+')','')+' null' from information_schema.columns where table_name=@table order by ordinal_position
        set @cmd = @cmd + ')'
        exec(@cmd)

        exec('create index i_s__dml_dt on r_'+@table+' (s__dml_dt)')
    end

    -- delete trigger
    set @trgname = 'trg_'+@table+'_dl_del'
    if object_id(@trgname) is not null exec('drop trigger '+@trgname)
    exec('
        create trigger '+@trgname+' on '+@table+' after delete as
        begin
          insert into r_'+@table+' select getdate(),''d'',t.* from deleted t
        end
    ')
    -- insert trigger
    set @trgname = 'trg_'+@table+'_dl_ins'
    if object_id(@trgname) is not null exec('drop trigger '+@trgname)
    exec('
        create trigger '+@trgname+' on '+@table+' after insert as
        begin
          insert into r_'+@table+' select getdate(),''i'',t.* from inserted t
        end
    ')
    -- update trigger
    set @trgname = 'trg_'+@table+'_dl_upd'
    if object_id(@trgname) is not null exec('drop trigger '+@trgname)
    exec('
        create trigger '+@trgname+' on '+@table+' after update as
        begin
          insert into r_'+@table+' select getdate(),''u'',t.* from deleted t
        end
    ')


    fetch next from c_tables into @table,@create_table
end
close c_tables
deallocate c_tables

How much storage space do you have?

The last time I did something like this we inserted new rows for each changed column in a separate change log table. We did it using client-side logic, but you could get the same effect with a trigger.

This takes up a lot of room and slows down your writes, but it does give you fast read access to the change log.

PS We didn't have a general solution, so we only did it for the one table that needed UI support. Everything else used pseudo-temporal tables. (Old version of SQL Server.)

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM