
Read records from a table with Optimised Historical Data

We have a project management app in our organization from which we receive more than 10K records every day through a scheduled process. In our first database design we stored all 10K records every day, regardless of whether anything in the data had changed, so the table kept growing bigger and bigger.

It looks like this:

Extraction Log

LogID   ExtractionTime
-------------------------
1        15/01/2021
2        16/01/2021
3        17/01/2021
....
...
..
--------------------------------


PCode       PName         LogId
------------------------------
P1234       Project1      1
P5734       Project2      1
P2785       Project3      1
P5854       Project4      1
P6985       Project5      1
P4748       Project6      1
P2233       Project7      1
P1234       Project1      2
P5734       Project2      2
P2785       Project3      2
P5854       Project4_upd  2
P6985       Project5      2
P4748       Project6      2
P2233       Project7      2
P1234       Project1_upd  3
P2785       Project3      3
P5854       Project4      3
P6985       Project5      3
P4748       Project6      3
P2233       Project7      3
P8464       Project8_New  3
.....
...
..
---------------------------------------

Here the data gets duplicated for no reason, and that makes it very hard to manage.

So we have created a new way of handling this. In the example above you can see that only very small changes happened across the three days of extraction:

PCode       PName         LogId  Change
---------------------------------------
P1234       Project1      1    N
P5734       Project2      1    N
P2785       Project3      1    N
P5854       Project4      1    N
P6985       Project5      1    N
P4748       Project6      1    N
P2233       Project7      1    N
P5854       Project4_upd  2    U
P1234       Project1_upd  3    U
P8464       Project8_New  3    N
P5734       Project2      3    R
.....
...
..
---------------------------------------

Here, as you can see, we only store the records that changed since the previous day's extraction, and we mark each change as N (new), U (updated) or R (removed).
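For completeness, the N/U/R delta rows can be produced by comparing each day's full extract against the previous state. A minimal sketch in Python; the helper `diff_extract` and the dict-based snapshots are illustrative assumptions, not part of the original process:

```python
def diff_extract(current, new, log_id):
    """current/new: dict mapping PCode -> PName.
    Returns delta rows (PCode, PName, LogId, Change)
    with Change in {'N', 'U', 'R'}."""
    delta = []
    for pcode, pname in new.items():
        if pcode not in current:
            delta.append((pcode, pname, log_id, 'N'))   # new project
        elif current[pcode] != pname:
            delta.append((pcode, pname, log_id, 'U'))   # updated project
    for pcode, pname in current.items():
        if pcode not in new:
            delta.append((pcode, pname, log_id, 'R'))   # removed project
    return delta

# Two made-up daily snapshots to show each change type once.
day2 = {'P1234': 'Project1', 'P5854': 'Project4_upd'}
day3 = {'P1234': 'Project1_upd', 'P8464': 'Project8_New'}
print(diff_extract(day2, day3, 3))  # one 'U', one 'N', one 'R' row
```

Note that a removal keeps the last known PName, matching the `P5734  Project2 ... R` row in the table above.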

Conceptually, what we need is to retrieve all of the data as of a specific date that we choose.

For example, if I choose 17/01/2021 it should return this:

PCode       PName         LogId  Change
---------------------------------------
P1234       Project1      1    N
P5734       Project2      1    N
P2785       Project3      1    N
P6985       Project5      1    N
P4748       Project6      1    N
P2233       Project7      1    N
P5854       Project4_upd  2    U
---------------------------------

That is, it takes the latest record of each project as of that date, even when there is only one record for that LogId. The same should work when choosing the date for LogId 3.

Honestly, I haven't found a good way to handle this in a query. This sample has only a few rows, but in reality, as time moves on, there will be a lot of data. What is the best way to retrieve the data as of a given date?

I think this is what you want.

Assuming that LogId is incremental, you can use the required date to look up the LogId and then get the result, using row_number() to keep only the latest row per PCode:

select p.PCode, p.PName, p.LogId, p.Change
from
(
    select p.PCode, p.PName, p.LogId, p.Change,
           rn = row_number() over (partition by p.PCode order by p.LogId desc)
    from   Log l
           inner join Project p on p.LogId < l.LogId
    where  l.ExtractionTime = '2021-01-17'
) p
where p.rn = 1
  and p.Change <> 'R'  -- drop projects whose latest change is a removal
order by PCode
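The query can be exercised end to end with SQLite's window-function support. A minimal, self-contained sketch in Python; the table and column names (`Log`, `Project`, `Change`) follow the query above, the in-memory rows mirror the question's sample, and an extra filter drops projects whose latest change is R (removed):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Log (LogId INTEGER PRIMARY KEY, ExtractionTime TEXT);
CREATE TABLE Project (PCode TEXT, PName TEXT, LogId INTEGER, Change TEXT);

INSERT INTO Log VALUES (1, '2021-01-15'), (2, '2021-01-16'), (3, '2021-01-17');

INSERT INTO Project VALUES
  ('P1234', 'Project1',     1, 'N'),
  ('P5734', 'Project2',     1, 'N'),
  ('P2785', 'Project3',     1, 'N'),
  ('P5854', 'Project4',     1, 'N'),
  ('P6985', 'Project5',     1, 'N'),
  ('P4748', 'Project6',     1, 'N'),
  ('P2233', 'Project7',     1, 'N'),
  ('P5854', 'Project4_upd', 2, 'U'),
  ('P1234', 'Project1_upd', 3, 'U'),
  ('P8464', 'Project8_New', 3, 'N'),
  ('P5734', 'Project2',     3, 'R');
""")

rows = con.execute("""
SELECT PCode, PName, LogId, Change
FROM (
    SELECT p.PCode, p.PName, p.LogId, p.Change,
           ROW_NUMBER() OVER (PARTITION BY p.PCode
                              ORDER BY p.LogId DESC) AS rn
    FROM   Log l
           JOIN Project p ON p.LogId < l.LogId
    WHERE  l.ExtractionTime = '2021-01-17'
) t
WHERE rn = 1 AND Change <> 'R'  -- skip projects removed before this date
ORDER BY PCode
""").fetchall()

for r in rows:
    print(r)  # 7 rows, including ('P5854', 'Project4_upd', 2, 'U')
```

This reproduces the expected output in the question: seven projects, with P5854 showing its updated name from LogId 2 and the LogId 3 changes excluded.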

demo
