
How to compare rows in a table and keep the highest values in the row in Oracle

I am looking for a query that compares each row's value with the previous row's value and, if the difference between the two is less than 10%, keeps the previous value. I am sure this can be achieved with Oracle's LAG function, but I have not been able to find the exact solution. Any help would be greatly appreciated.

I tried the query below, but it does not carry the kept value forward to later rows, so it did not solve my problem.

select /*+ parallel(64) */ a, b, c, datevalue, pricevalue, 
        lag(pricevalue,1,0) over (partition by a, b, c order by a, b, c, datevalue) as prev_pricevalue,
        (pricevalue - lag(pricevalue,1,0) over (partition by a, b, c order by a, b, c, datevalue))/pricevalue as diff,
        case 
            when (pricevalue - lag(pricevalue,1,0) over (partition by a, b, c order by a, b, c, datevalue))/pricevalue
                  < 0.1 then lag(pricevalue,1,0) over (partition by a, b, c order by a, b, c, datevalue)
            else pricevalue
            end new_pricevalue
      from table1
      where datevalue between '18-MAY-2019' and '31-MAY-2019';

My data looks like this. The columns are A, B, C, DATE and VALUE.

A               B       C       DATE        VALUE
16587EA_1005    RETAIL  7207    18/05/2019  7.04
16587EA_1005    RETAIL  7207    19/05/2019  7.04
16587EA_1005    RETAIL  7207    20/05/2019  7.04
16587EA_1005    RETAIL  7207    21/05/2019  7.04
16587EA_1005    RETAIL  7207    22/05/2019  7.04
16587EA_1005    RETAIL  7207    23/05/2019  7
16587EA_1005    RETAIL  7207    24/05/2019  7
16587EA_1005    RETAIL  7207    25/05/2019  7
16587EA_1005    RETAIL  7207    26/05/2019  7
16587EA_1005    RETAIL  7207    27/05/2019  7
16587EA_1005    RETAIL  7207    28/05/2019  7
16587EA_1005    RETAIL  7207    29/05/2019  8
16587EA_1005    RETAIL  7207    30/05/2019  8
16587EA_1005    RETAIL  7207    31/05/2019  8
16587EA_1005    RETAIL  7207    01/06/2019  8.05
16587EA_1005    RETAIL  7207    02/06/2019  8.05
16587EA_1005    RETAIL  7207    03/06/2019  8.05

And I want output like this.

A               B       C       DATE        VALUE
16587EA_1005    RETAIL  7207    18/05/2019  7.04
16587EA_1005    RETAIL  7207    19/05/2019  7.04
16587EA_1005    RETAIL  7207    20/05/2019  7.04
16587EA_1005    RETAIL  7207    21/05/2019  7.04
16587EA_1005    RETAIL  7207    22/05/2019  7.04
16587EA_1005    RETAIL  7207    23/05/2019  7.04
16587EA_1005    RETAIL  7207    24/05/2019  7.04
16587EA_1005    RETAIL  7207    25/05/2019  7.04
16587EA_1005    RETAIL  7207    26/05/2019  7.04
16587EA_1005    RETAIL  7207    27/05/2019  7.04
16587EA_1005    RETAIL  7207    28/05/2019  7.04
16587EA_1005    RETAIL  7207    29/05/2019  8
16587EA_1005    RETAIL  7207    30/05/2019  8
16587EA_1005    RETAIL  7207    31/05/2019  8
16587EA_1005    RETAIL  7207    01/06/2019  8
16587EA_1005    RETAIL  7207    02/06/2019  8
16587EA_1005    RETAIL  7207    03/06/2019  8

Best Regards MMR

Not exactly a concise method, but a recursive CTE can do this.

WITH CTE AS
(
  -- adding a rank and rownum
  SELECT t.*
  , DENSE_RANK() OVER (ORDER BY a, b, c) AS rnk
  , ROW_NUMBER() OVER (PARTITION BY a, b, c ORDER BY datevalue) rn
  FROM table1 t
),
RCTE (rnk, rn, a, b, c, datevalue, pricevalue) AS
(
  -- seeding the recursion
  SELECT rnk, rn, a, b, c, datevalue, pricevalue
  FROM CTE
  WHERE rn = 1

  UNION ALL

  -- loop through the records for each rank
  SELECT c.rnk, c.rn, c.a, c.b, c.c, c.datevalue,
  CASE 
  WHEN ABS(r.pricevalue - c.pricevalue) / c.pricevalue < 0.1
  THEN r.pricevalue
  ELSE c.pricevalue
  END
  FROM RCTE r
  JOIN CTE c
    ON c.rnk = r.rnk
   AND c.rn = r.rn + 1
)
SELECT * 
FROM RCTE
ORDER BY rnk, rn;

Returns:

RNK  RN  A  B  C  DATEVALUE  PRICEVALUE
  1   1  a  b  c  18-MAY-19        7.04
  1   2  a  b  c  19-MAY-19        7.04
  1   3  a  b  c  20-MAY-19        7.04
  1   4  a  b  c  21-MAY-19        7.04
  1   5  a  b  c  22-MAY-19        7.04
  1   6  a  b  c  23-MAY-19        7.04
  1   7  a  b  c  24-MAY-19        7.04
  1   8  a  b  c  25-MAY-19        7.04
  1   9  a  b  c  26-MAY-19        7.04
  1  10  a  b  c  27-MAY-19        7.04
  1  11  a  b  c  28-MAY-19        7.04
  1  12  a  b  c  29-MAY-19           8
  1  13  a  b  c  30-MAY-19           8
  1  14  a  b  c  31-MAY-19           8
  1  15  a  b  c  01-JUN-19           8
  1  16  a  b  c  02-JUN-19           8

A test on db<>fiddle here
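
For readers who want to check the rule outside the database, here is a minimal Python sketch of what the recursive step does: each row is compared with the value kept for the previous row, not with the previous raw value. The name `carry_forward` and the `threshold` parameter are made up for this illustration, and the list is the VALUE column from the question for a single (a, b, c) group.

```python
def carry_forward(prices, threshold=0.1):
    """Keep the previously kept price while the relative change stays below threshold."""
    kept = []
    for price in prices:
        # Compare against the last KEPT value, mirroring r.pricevalue in the recursive CTE.
        if kept and abs(kept[-1] - price) / price < threshold:
            kept.append(kept[-1])
        else:
            # First row of the group, or a change of at least `threshold`: take the new price.
            kept.append(price)
    return kept


# VALUE column from the question, ordered by date.
prices = [7.04] * 5 + [7] * 6 + [8] * 3 + [8.05] * 3
result = carry_forward(prices)
# result holds 7.04 eleven times, followed by 8 six times.
```

This matches the desired output in the question: the 7 rows stay at 7.04, the jump to 8 is taken, and the 8.05 rows stay at 8.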

My attempt :) You can detect the rows where the change from the previous row is 10% or more and use the new value there; in all other rows leave null. Then fill the nulls using lag with the ignore nulls clause:

select a, b, c, dv, pv, 
       nvl(cpv, lag(cpv, 1, cpv) ignore nulls over (partition by a, b, c order by dv)) new_pv
  from (
    select a, b, c, dv, pv, ppv, case when rn = 1 or abs((ppv - pv)/pv) >= .1 then pv end cpv
      from (
        select a, b, c, datevalue dv, pricevalue pv, 
               row_number() over (partition by a, b, c order by datevalue) rn,
               lag(pricevalue) over (partition by a, b, c order by datevalue) ppv
          from table1))

dbfiddle

It should be faster than @LukStorms recursive solution, which is good and works too.
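
One caveat, sketched in Python and assuming the query is read correctly: the change flag compares each row with the previous raw value (`ppv`), while the recursive query compares with the previous kept value. On the sample data both give the same output, but on slowly drifting prices they can differ. `mark_and_fill` is a hypothetical name used only here, and the 10% threshold is taken from the question.

```python
def mark_and_fill(prices, threshold=0.1):
    """Flag rows that moved >= threshold versus the previous RAW row, then forward-fill."""
    filled = []
    last_flagged = None   # plays the role of the last non-null cpv
    previous_raw = None   # plays the role of ppv
    for price in prices:
        if previous_raw is None or abs(previous_raw - price) / price >= threshold:
            last_flagged = price
        filled.append(last_flagged)
        previous_raw = price
    return filled


# VALUE column from the question: same result as the desired output.
sample = [7.04] * 5 + [7] * 6 + [8] * 3 + [8.05] * 3
sample_result = mark_and_fill(sample)    # 7.04 eleven times, then 8 six times

# Made-up drifting prices: every step is under 10%, so nothing is ever flagged.
drifting = [100, 95, 90, 85]
drift_result = mark_and_fill(drifting)   # [100, 100, 100, 100]
```

Comparing against the kept value instead would switch to 90 at the third row, since |100 - 90| / 90 is about 0.11, so pick the variant that matches the intended rule.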
