Oracle: Ordering/sorting records based on the difference in a column's value between different rows of the same table

I have a table, error_event_table, with three columns:

eventtime timestamp;
url varchar2(1024);
errorcount number(30);

The table holds more than 3 million rows. I need to find the top n (100) URLs based on the difference in the errorcount column between a given start time and end time. For example, given this table data:

eventtime           | url      |errorcount
2018-01-29 10:20:00 | url1.com | 950
2018-01-29 10:25:00 | url1.com | 1000
2018-01-29 10:20:00 | url2.com | 100
2018-01-29 10:25:00 | url2.com | 400
2018-01-29 10:25:00 | url3.com | 500
2018-01-29 10:10:00 | url35.com | 500

When startTime = 2018-01-29 10:20:00 and endTime = 2018-01-29 10:25:00 are passed as inputs to the query, the expected output is:

eventtime           | url      |errorcount

2018-01-29 10:25:00 | url3.com | 500
2018-01-29 10:25:00 | url2.com | 400
2018-01-29 10:20:00 | url2.com | 100
2018-01-29 10:25:00 | url1.com | 1000
2018-01-29 10:20:00 | url1.com | 950

The query should order the records by the difference in errorcount between the given start time and end time (the inputs to the query), in descending order, and limit the results to the top 100. In other words, the query should find the 100 URLs with the largest difference between end time and start time, and return those URLs' records at both the start time and the end time.

A URL may exist only at the end time and not at the start time; in that case the start-time errorcount should be taken as 0. Similarly, a URL might exist only at the start time, in which case the difference is negative, and I don't want these negative-difference records in my results.

I have tried two approaches but have not been able to find a proper way to proceed further.

Approach 1: Using Group By

SELECT url, 
       Max(eventtime), 
       Max(errorcount) 
FROM   error_event_table 
WHERE  eventtime IN ( To_date(:startTime, 'yyyymmddHH24MISS'), 
                      To_date(:endTime, 'yyyymmddHH24MISS') ) 
GROUP  BY url 
ORDER  BY Max(errorcount) DESC; 

Approach 2: Using Self Join

SELECT t2.url                            eurl, 
       t1.url                            surl, 
       t2.eventtime                      endtime, 
       t1.eventtime                      starttime, 
       ( t2.errorcount - t1.errorcount ) diff 
FROM   error_event_table t1, 
       error_event_table t2 
WHERE  ( t1.eventtime = To_date(:startTime, 'yyyymmddHH24MISS') 
          OR t2.eventtime = To_date(:endTime, 'yyyymmddHH24MISS') ) 
       AND t2.url (+) = t1.url 
ORDER  BY ( t2.errorcount - t1.errorcount ) DESC 

Please provide input on how to approach this problem.

If I understand correctly, you want to use lag():

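-- diff = this row's errorcount minus the previous row's errorcount for the same url
-- (lag defaults to 0 when the url has no earlier row)
select t.*,
       (errorcount -
        lag(errorcount, 1, 0) over (partition by url order by eventtime)
       ) as diff
from error_event_table t
where <date conditions here>
order by diff desc;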
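
To see what lag() with a default of 0 produces, here is a small sketch that inlines the question's sample rows in a WITH clause (the name sample_rows and the literal timestamps are only for illustration, and the date condition is assumed to be "eventtime is either the start time or the end time"). Note that the first row of each URL is compared against 0, so a start-time row gets a diff equal to its own errorcount.

```sql
-- Sample rows from the question, inlined so the sketch runs without the real table.
with sample_rows as (
  select timestamp '2018-01-29 10:20:00' as eventtime, 'url1.com' as url, 950 as errorcount from dual union all
  select timestamp '2018-01-29 10:25:00', 'url1.com',  1000 from dual union all
  select timestamp '2018-01-29 10:20:00', 'url2.com',  100  from dual union all
  select timestamp '2018-01-29 10:25:00', 'url2.com',  400  from dual union all
  select timestamp '2018-01-29 10:25:00', 'url3.com',  500  from dual union all
  select timestamp '2018-01-29 10:10:00', 'url35.com', 500  from dual
)
select url,
       eventtime,
       errorcount,
       -- previous errorcount for the same url, or 0 if there is no earlier row
       errorcount - lag(errorcount, 1, 0) over (partition by url order by eventtime) as diff
from sample_rows
-- assumed date condition: keep only the start-time and end-time rows
where eventtime in (timestamp '2018-01-29 10:20:00',
                    timestamp '2018-01-29 10:25:00')
order by diff desc;

-- Result on the sample data:
--   url1.com | 10:20 |  950 | diff 950   (first row for url1, compared against 0)
--   url3.com | 10:25 |  500 | diff 500
--   url2.com | 10:25 |  400 | diff 300
--   url2.com | 10:20 |  100 | diff 100   (first row for url2, compared against 0)
--   url1.com | 10:25 | 1000 | diff  50
```

Only the end-time rows carry the start-to-end difference the question asks about, so the start-time rows would need to be excluded before ranking URLs by diff.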

EDIT:

If you just want the URLs with the maximums, then:

select t.*
from (-- seqnum numbers the rows within each url, largest diff first
      select t.*, row_number() over (partition by url order by diff desc) as seqnum
      from (select t.*,
                   (errorcount -
                    lag(errorcount, 1, 0) over (partition by url order by eventtime)
                   ) as diff
            from error_event_table t
            where <date conditions here>
           ) t
     ) t
where seqnum <= 100
order by diff desc;
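
Because seqnum above is computed per URL, it limits rows within a URL rather than the number of URLs. As an alternative, here is a sketch that follows the rules stated in the question more literally: compute one diff per URL, drop negative diffs, keep the 100 largest, then return those URLs' rows at both times. It is not the answer's method. It assumes at most one row per (url, eventtime), treats a diff of exactly 0 as acceptable, and uses CTE names (params, per_url, top_urls) that are made up for illustration.

```sql
with params as (
  -- Bind variables in the same string format the question uses.
  select to_date(:startTime, 'yyyymmddHH24MISS') as start_time,
         to_date(:endTime,   'yyyymmddHH24MISS') as end_time
  from dual
),
per_url as (
  -- One row per url: end-time errorcount minus start-time errorcount.
  -- A missing start-time row counts as 0; a missing end-time row leaves diff NULL.
  select e.url,
         max(case when e.eventtime = p.end_time   then e.errorcount end)
         - nvl(max(case when e.eventtime = p.start_time then e.errorcount end), 0) as diff
  from error_event_table e
  cross join params p
  where e.eventtime in (p.start_time, p.end_time)
  group by e.url
),
top_urls as (
  -- Keep the 100 urls with the largest non-negative diff.
  -- "diff >= 0" also removes the NULL diffs of urls seen only at the start time.
  select url, diff
  from (select url, diff,
               row_number() over (order by diff desc, url) as rn
        from per_url
        where diff >= 0)
  where rn <= 100
)
-- Return the start-time and end-time rows of the selected urls.
select e.eventtime, e.url, e.errorcount
from error_event_table e
cross join params p
join top_urls t on t.url = e.url
where e.eventtime in (p.start_time, p.end_time)
order by t.diff desc, e.url, e.eventtime desc;
```

On the question's sample data with startTime = 20180129102000 and endTime = 20180129102500, the per-URL diffs are 500 for url3.com, 300 for url2.com, and 50 for url1.com, which gives the row order shown in the expected output. On Oracle 12c or later, the row_number() filter could be replaced with FETCH FIRST 100 ROWS ONLY.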
