Optimizing a self-join Oracle SQL query with LAG / LEAD analytic functions?

Question

We have an Oracle SQL query to identify records where the value of a table column has changed from one record to the next. The relevant columns are (ID, SOME_COLUMN, FROM_DATE, TO_DATE), where ID is not unique, and FROM_DATE and TO_DATE determine the time interval during which the particular row for that ID was effective, i.e.

(ID1, VAL1, 01/01/2016, 03/01/2016)
(ID1, VAL2, 04/01/2016, 09/01/2016)
(ID1, VAL3, 10/01/2016, 19/01/2016)

etc.

We could implement this using the following self-join:

SELECT N.ID,
       O.SOME_COLUMN OLD_VALUE,
       N.SOME_COLUMN NEW_VALUE
FROM OUR_TABLE N, OUR_TABLE O
WHERE N.ID = O.ID
  AND N.FROM_DATE - 1 = O.TO_DATE
  AND N.SOME_COLUMN <> O.SOME_COLUMN

However, since the table contains around 100 million records, the self-join performs poorly. Is there a more efficient way to do this? Someone suggested analytic functions (e.g. LAG), but we have not been able to work out a working solution so far. Any ideas would be appreciated.

Answer 1

Yes, you can use LEAD() to fetch the value from the next row of the same ID:

SELECT t.id,
       t.some_column as OLD_VALUE,
       LEAD(t.some_column) OVER(PARTITION BY t.id ORDER BY t.from_date) as NEW_VALUE
FROM YourTable t

If you want only the changes, wrap it in an outer SELECT and filter on OLD_VALUE <> NEW_VALUE.
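The wrapped form of this query can be sketched as follows. This is only an illustration under stated assumptions: it runs the SQL against an in-memory SQLite database through Python's sqlite3 module rather than against Oracle (window functions need SQLite 3.25 or later), the dates are stored as ISO text, and the third sample row is hypothetical, added so that one transition has no change.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for Oracle; the analytic syntax
# (LEAD ... OVER (PARTITION BY ... ORDER BY ...)) is the same in both.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE YourTable (id TEXT, some_column TEXT, from_date TEXT, to_date TEXT)"
)
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?, ?)",
    [
        ("ID1", "VAL1", "2016-01-01", "2016-01-03"),
        ("ID1", "VAL2", "2016-01-04", "2016-01-09"),
        # Hypothetical row: the value stays VAL2, so it must not be reported.
        ("ID1", "VAL2", "2016-01-10", "2016-01-19"),
    ],
)

# Inner query pairs each row with the next row's value for the same id;
# outer query keeps only the pairs where the value actually changed.
# The last row per id gets NEW_VALUE = NULL, and NULL <> x is not true,
# so it is filtered out as well.
changes = conn.execute(
    """
    SELECT id, old_value, new_value
    FROM (
        SELECT t.id,
               t.from_date,
               t.some_column AS old_value,
               LEAD(t.some_column) OVER (PARTITION BY t.id ORDER BY t.from_date) AS new_value
        FROM YourTable t
    ) AS w
    WHERE old_value <> new_value
    ORDER BY id, from_date
    """
).fetchall()

print(changes)
```

Note that this form compares each row with the next one by FROM_DATE only; unlike the original self-join, it does not check that the two intervals are adjacent.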

Answer 2

If you want the old value and the new value in a single row, then use lag():

select t.*,
       lag(some_column) over (partition by id order by from_date) as prev_val
from t;

If the value may stay the same between consecutive rows (as suggested by your sample query), filter in an outer query:

select t.*
from (select t.*,
             lag(some_column) over (partition by id order by from_date) as prev_val
      from t
     ) t
where prev_val <> some_column;

Answer 3

I think this is the LAG() approach you were talking about.

SELECT *
  FROM (
    SELECT N.ID,
           N.SOME_COLUMN NEW_VALUE,
           N.FROM_DATE,
           lag(N.SOME_COLUMN) over (partition by N.ID order by N.FROM_DATE) OLD_VALUE,
           lag(N.TO_DATE) over (partition by N.ID order by N.FROM_DATE) OLD_TO_DATE
    FROM OUR_TABLE N
) T
WHERE FROM_DATE - 1 = OLD_TO_DATE
  AND NEW_VALUE <> OLD_VALUE;
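To see how the adjacency check behaves, here is a minimal sketch of the same query run on the question's sample rows. It rests on a few assumptions: SQLite (3.25 or later) accessed through Python's sqlite3 module stands in for Oracle, dates are stored as ISO text, Oracle's FROM_DATE - 1 is replaced by SQLite's date(from_date, '-1 day'), and the two ID2 rows are hypothetical, added to show a gap between intervals.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE OUR_TABLE (id TEXT, some_column TEXT, from_date TEXT, to_date TEXT)"
)
conn.executemany(
    "INSERT INTO OUR_TABLE VALUES (?, ?, ?, ?)",
    [
        # Sample rows from the question, dates rewritten as ISO text.
        ("ID1", "VAL1", "2016-01-01", "2016-01-03"),
        ("ID1", "VAL2", "2016-01-04", "2016-01-09"),
        ("ID1", "VAL3", "2016-01-10", "2016-01-19"),
        # Hypothetical rows: the value changes, but the intervals are not adjacent.
        ("ID2", "A", "2016-01-01", "2016-01-05"),
        ("ID2", "B", "2016-01-10", "2016-01-20"),
    ],
)

# One pass over the table: LAG pulls the previous row's value and end date
# for the same id, then the outer filter keeps adjacent intervals whose
# value differs. The first row per id has NULL lag values and drops out.
rows = conn.execute(
    """
    SELECT id, new_value, old_value
    FROM (
        SELECT n.id,
               n.some_column AS new_value,
               n.from_date,
               LAG(n.some_column) OVER (PARTITION BY n.id ORDER BY n.from_date) AS old_value,
               LAG(n.to_date) OVER (PARTITION BY n.id ORDER BY n.from_date) AS old_to_date
        FROM OUR_TABLE n
    ) AS t
    WHERE date(from_date, '-1 day') = old_to_date  -- SQLite stand-in for FROM_DATE - 1
      AND new_value <> old_value
    ORDER BY id, from_date
    """
).fetchall()

print(rows)
```

The ID2 change is not reported because its second interval does not start the day after the first one ends, which matches the N.FROM_DATE - 1 = O.TO_DATE condition of the original self-join.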

3 answers

Answer 1: score 4, accepted, 2016-09-14 11:18:28
Answer 2: score 1, 2016-09-14 11:19:41
Answer 3: score 1, 2016-09-14 11:20:45
