I have a table, error_event_table, with 3 columns:
eventtime timestamp;
url varchar2(1024);
errorcount number(30);
The table holds more than 3 million rows. I need to find the top n (100) URLs based on the difference in the errorcount column value between a given start time and end time. For example, with the table data below:
eventtime | url |errorcount
2018-01-29 10:20:00 | url1.com | 950
2018-01-29 10:25:00 | url1.com | 1000
2018-01-29 10:20:00 | url2.com | 100
2018-01-29 10:25:00 | url2.com | 400
2018-01-29 10:25:00 | url3.com | 500
2018-01-29 10:10:00 | url35.com | 500
When startTime = 2018-01-29 10:20:00 and endTime = 2018-01-29 10:25:00 are passed as inputs to the query, the expected output is:
eventtime | url |errorcount
2018-01-29 10:25:00 | url3.com | 500
2018-01-29 10:25:00 | url2.com | 400
2018-01-29 10:20:00 | url2.com | 100
2018-01-29 10:25:00 | url1.com | 1000
2018-01-29 10:20:00 | url1.com | 950
The query should order the records by the difference in errorcount between the given start time and end time (the query inputs), descending, and limit the results to the top 100. In other words, the query should find the top 100 URLs with the largest difference between end time and start time, and return those URLs' records at both the start time and the end time.
It is possible that a URL exists only at the end time and not at the start time; in that case the start time errorcount should be taken as 0. Similarly, a URL might exist only at the start time, in which case the difference will be negative, and I don't want these negative-difference records in my results.
I have tried two approaches, but neither gets me to a working solution.
Approach 1: Using Group By
SELECT url,
Max(eventtime),
Max(errorcount)
FROM error_event_table
WHERE eventtime IN ( To_date(:startTime, 'yyyymmddHH24MISS'),
To_date(:endTime, 'yyyymmddHH24MISS')
)
GROUP BY url
ORDER BY Max(errorcount)DESC;
Approach 2: Using Self Join
SELECT t2.url eurl,
t1.url surl,
t2.eventtime endtime,
t1.eventtime starttime,
( t2.errorcount - t1.errorcount ) diff
FROM error_event_table t1,
error_event_table t2
WHERE ( t1.eventtime = To_date(:startTime, 'yyyymmddHH24MISS')
OR t2.eventtime = To_date(:endTime, 'yyyymmddHH24MISS') )
AND t2.url (+) = t1.url
ORDER BY ( t2.errorcount - t1.errorcount ) DESC
Please suggest how to approach this problem.
If I understand correctly, you want to use lag():
select t.*,
(errorcount -
lag(errorcount, 1, 0) over (partition by url order by eventtime)
) as diff
from error_event_table t
where <date conditions here>
order by diff desc;
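To see what the lag() query produces on the sample data from the question, here is a minimal sketch that runs the same window expression in SQLite through Python's sqlite3 module. SQLite is only a stand-in for Oracle here, the timestamps are stored as plain text, the date condition `eventtime IN (:start_time, :end_time)` is an assumption, and the helper names `make_db` and `lag_diffs` are hypothetical. Note that a row with no earlier row in the filtered set (the start-time row, or a URL seen only at the end time) gets its own errorcount as diff, because the lag default is 0.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    ("2018-01-29 10:20:00", "url1.com", 950),
    ("2018-01-29 10:25:00", "url1.com", 1000),
    ("2018-01-29 10:20:00", "url2.com", 100),
    ("2018-01-29 10:25:00", "url2.com", 400),
    ("2018-01-29 10:25:00", "url3.com", 500),
    ("2018-01-29 10:10:00", "url35.com", 500),
]

# Same lag() expression as the answer; the WHERE clause is an assumed
# stand-in for "<date conditions here>".
LAG_SQL = """
SELECT eventtime, url, errorcount,
       errorcount
         - LAG(errorcount, 1, 0) OVER (PARTITION BY url ORDER BY eventtime) AS diff
FROM error_event_table
WHERE eventtime IN (:start_time, :end_time)
ORDER BY diff DESC, url
"""


def make_db():
    """Create an in-memory table with the question's sample data."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE error_event_table (eventtime TEXT, url TEXT, errorcount INTEGER)"
    )
    conn.executemany("INSERT INTO error_event_table VALUES (?, ?, ?)", ROWS)
    return conn


def lag_diffs(conn, start_time, end_time):
    """Return (eventtime, url, errorcount, diff) rows, largest diff first."""
    params = {"start_time": start_time, "end_time": end_time}
    return conn.execute(LAG_SQL, params).fetchall()


for row in lag_diffs(make_db(), "2018-01-29 10:20:00", "2018-01-29 10:25:00"):
    print(row)
```

On this data the start-time row of url1.com sorts first with a diff of 950, which is not the ordering the question asks for, so the per-row lag() value probably needs further filtering or aggregation per URL.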
EDIT:
If you just want the URLs with the maximums, then:
select t.*
from (select t.*, row_number() over (partition by url order by diff desc) as seqnum
from (select t.*,
(errorcount -
lag(errorcount, 1, 0) over (partition by url order by eventtime)
) as diff
from error_event_table t
where <date conditions here>
) t
) t
where seqnum <= 100
order by diff desc;
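As written, seqnum is numbered within each URL, so `seqnum <= 100` does not cap the number of URLs. One possible alternative, sketched here under assumptions rather than taken from the answer, is to compute one diff per URL with conditional aggregation, keep the top n non-negative diffs, and join back to fetch the start-time and end-time rows. The sketch runs in SQLite through Python's sqlite3 as a stand-in for Oracle; in Oracle 12c or later the `LIMIT :n` would become `FETCH FIRST :n ROWS ONLY`. The helper names `make_db` and `top_url_rows` are hypothetical, and treating a diff of exactly 0 as acceptable is an assumption.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    ("2018-01-29 10:20:00", "url1.com", 950),
    ("2018-01-29 10:25:00", "url1.com", 1000),
    ("2018-01-29 10:20:00", "url2.com", 100),
    ("2018-01-29 10:25:00", "url2.com", 400),
    ("2018-01-29 10:25:00", "url3.com", 500),
    ("2018-01-29 10:10:00", "url35.com", 500),
]

TOP_N_SQL = """
WITH diffs AS (
    -- One row per URL: end-time count minus start-time count,
    -- where a missing side contributes 0.
    SELECT url,
           SUM(CASE WHEN eventtime = :end_time THEN errorcount ELSE 0 END)
         - SUM(CASE WHEN eventtime = :start_time THEN errorcount ELSE 0 END) AS diff
    FROM error_event_table
    WHERE eventtime IN (:start_time, :end_time)
    GROUP BY url
),
top_urls AS (
    -- Drop negative diffs (URL present only at start time), keep the top n.
    SELECT url, diff FROM diffs WHERE diff >= 0 ORDER BY diff DESC LIMIT :n
)
SELECT e.eventtime, e.url, e.errorcount
FROM error_event_table e
JOIN top_urls t ON t.url = e.url
WHERE e.eventtime IN (:start_time, :end_time)
ORDER BY t.diff DESC, e.url, e.eventtime DESC
"""


def make_db():
    """Create an in-memory table with the question's sample data."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE error_event_table (eventtime TEXT, url TEXT, errorcount INTEGER)"
    )
    conn.executemany("INSERT INTO error_event_table VALUES (?, ?, ?)", ROWS)
    return conn


def top_url_rows(conn, start_time, end_time, n=100):
    """Return start/end rows of the n URLs with the largest non-negative diff."""
    params = {"start_time": start_time, "end_time": end_time, "n": n}
    return conn.execute(TOP_N_SQL, params).fetchall()


for row in top_url_rows(make_db(), "2018-01-29 10:20:00", "2018-01-29 10:25:00"):
    print(row)
```

With the inputs from the question, this returns the five rows in the order shown in the expected output. On a table with millions of rows, an index on eventtime (or on eventtime and url) would likely matter, but that has not been measured here.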