We have a project management app in our organization from which a scheduled process extracts more than 10,000 records every day. In our first database design we stored all of these 10K records every day, regardless of whether anything in the data had changed, so the table kept getting bigger and bigger.
It looks like this:
Extraction Log
LogID ExtractionTime
-------------------------
1 15/01/2021
2 16/01/2021
3 17/01/2021
....
...
..
--------------------------------
PCode PName LogId
------------------------------
P1234 Project1 1
P5734 Project2 1
P2785 Project3 1
P5854 Project4 1
P6985 Project5 1
P4748 Project6 1
P2233 Project7 1
P1234 Project1 2
P5734 Project2 2
P2785 Project3 2
P5854 Project4_upd 2
P6985 Project5 2
P4748 Project6 2
P2233 Project7 2
P1234 Project1_upd 3
P2785 Project3 3
P5854 Project4 3
P6985 Project5 3
P4748 Project6 3
P2233 Project7 3
P8464 Project8_New 3
.....
...
..
---------------------------------------
Here the data is duplicated for no reason, which makes it very hard to manage. As the example above shows, only a few small changes happened across the 3 days of extraction.
So we created a new way of handling this, like so:
PCode PName LogId Change
------------------------------
P1234 Project1 1 N
P5734 Project2 1 N
P2785 Project3 1 N
P5854 Project4 1 N
P6985 Project5 1 N
P4748 Project6 1 N
P2233 Project7 1 N
P5854 Project4_upd 2 U
P1234 Project1_upd 3 U
P8464 Project8_New 3 N
P5734 Project2 3 R
.....
...
..
---------------------------------------
As you can see, we now add only the records that changed since the previous day's extraction, and we flag the kind of change as N, U or R.
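The change-only load described above could be produced by comparing each new extraction with the previous one. The following is only a minimal sketch, not the actual scheduled process: the function name `diff_snapshots` is hypothetical, and it assumes that N, U and R stand for new, updated and removed, which the sample rows suggest but the post does not state.

```python
def diff_snapshots(previous, current):
    """Compare two full extractions, each a dict mapping PCode -> PName.

    Returns a list of (PCode, PName, Change) tuples holding only the
    records that differ, flagged N, U or R (assumed meanings: new,
    updated, removed).
    """
    changes = []
    for pcode, pname in current.items():
        if pcode not in previous:
            changes.append((pcode, pname, "N"))  # appears for the first time
        elif previous[pcode] != pname:
            changes.append((pcode, pname, "U"))  # same code, different name
    for pcode, pname in previous.items():
        if pcode not in current:
            changes.append((pcode, pname, "R"))  # no longer extracted
    return changes


# Extractions 1 and 2 from the sample data above.
extraction_1 = {
    "P1234": "Project1", "P5734": "Project2", "P2785": "Project3",
    "P5854": "Project4", "P6985": "Project5", "P4748": "Project6",
    "P2233": "Project7",
}
extraction_2 = dict(extraction_1, P5854="Project4_upd")

# The first extraction is compared with an empty snapshot, so every row is N.
first_load = diff_snapshots({}, extraction_1)
second_load = diff_snapshots(extraction_1, extraction_2)
print(second_load)  # [('P5854', 'Project4_upd', 'U')]
```

Only the rows returned by the function would be stored for a given LogId, which is why LogId 2 ends up with a single record.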
Conceptually, what we need is to retrieve all the data as of a specific date that we choose.
For example, if I choose 17/01/2021 it should return this:
PCode PName LogId Change
------------------------------
P1234 Project1 1 N
P5734 Project2 1 N
P2785 Project3 1 N
P6985 Project5 1 N
P4748 Project6 1 N
P2233 Project7 1 N
P5854 Project4_upd 2 U
---------------------------------
That is, it returns all the records as they stood on that date, even when only one record was stored for that LogId. The same should work for LogId 3.
Honestly, I haven't found a good way to handle this in a query. This sample has only a few rows, but in reality the data will grow a lot over time, so what is the best way to retrieve the data as of a given date?
I think this is what you want.
Assuming that LogId is incremental, you can look up the LogId for the required date and use it to get the result. row_number is used to keep only the row with the latest LogId per PCode:
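select *
from
(
    -- number the rows of each PCode, newest LogId first
    select p.PCode, p.PName, p.LogId, p.Change,
           rn = row_number() over (partition by p.PCode order by p.LogId desc)
    from Log l
    -- strict "<": only extractions before the chosen date's LogId are considered
    inner join Project p on p.LogId < l.LogId
    where l.ExtractionTime = '2021-01-17'
) p
-- keep only the latest row per PCode
where p.rn = 1
order by PCode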
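To check the query against the sample rows from the question, it can be run on a throwaway in-memory database. This is only a sketch under several assumptions: SQLite (via Python's `sqlite3`) stands in for the real database, dates are stored as ISO text, the helper name `snapshot_before` is made up for the example, and the T-SQL `rn = row_number()` alias is written as `row_number() ... AS rn` because SQLite does not accept the former.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Log (LogId INTEGER PRIMARY KEY, ExtractionTime TEXT);
    CREATE TABLE Project (PCode TEXT, PName TEXT, LogId INTEGER, Change TEXT);

    INSERT INTO Log VALUES (1, '2021-01-15'), (2, '2021-01-16'), (3, '2021-01-17');

    -- The change-only rows from the question.
    INSERT INTO Project VALUES
        ('P1234', 'Project1',     1, 'N'),
        ('P5734', 'Project2',     1, 'N'),
        ('P2785', 'Project3',     1, 'N'),
        ('P5854', 'Project4',     1, 'N'),
        ('P6985', 'Project5',     1, 'N'),
        ('P4748', 'Project6',     1, 'N'),
        ('P2233', 'Project7',     1, 'N'),
        ('P5854', 'Project4_upd', 2, 'U'),
        ('P1234', 'Project1_upd', 3, 'U'),
        ('P8464', 'Project8_New', 3, 'N'),
        ('P5734', 'Project2',     3, 'R');
""")

# Same logic as the query above, with the date passed as a parameter.
SNAPSHOT_SQL = """
    SELECT PCode, PName, LogId, Change
    FROM (
        SELECT p.PCode, p.PName, p.LogId, p.Change,
               row_number() OVER (PARTITION BY p.PCode ORDER BY p.LogId DESC) AS rn
        FROM Log l
        INNER JOIN Project p ON p.LogId < l.LogId
        WHERE l.ExtractionTime = ?
    ) AS latest
    WHERE rn = 1
    ORDER BY PCode
"""


def snapshot_before(extraction_date):
    """Return the latest row per PCode from extractions before the given date's LogId."""
    return conn.execute(SNAPSHOT_SQL, (extraction_date,)).fetchall()


rows = snapshot_before("2021-01-17")
for row in rows:
    print(row)
```

For '2021-01-17' this returns seven rows, the same set as the expected result in the question, sorted by PCode. If the snapshot should instead include the chosen day's own changes, the comparison would presumably become `<=`, and rows whose latest flag is R would then need to be filtered out; neither adjustment is part of the query above.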