SQL / Teradata: return records where the value in consecutive rows is the same

Question

I have a data set that looks like:

ID        date     emp_num    loc
1111     5/2/16    111111     Brooklyn
1112     5/3/16    222222     Detroit
1113     5/3/16    333333     San Diego
1114     5/2/16    333333     Orlando
1115     5/5/16    333333     Brooklyn
1116     5/7/16    111111     Orlando

In this case, I want to return records 1113, 1114, and 1115, because emp_num is the same in consecutive rows (ordered by ID).

I use Teradata, but if anyone has a SQL solution for another engine, I can usually manage to translate it.

Thank you.

Answer 1

You need to look at the previous and next row and check whether emp_num is unchanged in either of them:

SELECT * 
FROM tab
QUALIFY 
   MIN(emp_num) --previous row
   OVER (ORDER BY ID
         ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) = emp_num
OR
   MIN(emp_num) -- next row
   OVER (ORDER BY ID
         ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING) = emp_num

In Standard SQL this would be a task for LAG / LEAD, but Teradata doesn't implement them, so you have to rewrite it with a one-row window frame as above.
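
To try the one-row-frame idea without a Teradata system, here is a minimal sketch that runs the same window expressions in SQLite through Python's sqlite3 module. The choice of SQLite is an assumption made only for testing: it has no QUALIFY clause, so the window results are computed in a subquery and filtered in an outer WHERE. The function name consecutive_ids and the column name dt are illustrative and not from the original answer.

```python
import sqlite3

# Sample rows from the question: (id, dt, emp_num, loc).
ROWS = [
    (1111, "5/2/16", 111111, "Brooklyn"),
    (1112, "5/3/16", 222222, "Detroit"),
    (1113, "5/3/16", 333333, "San Diego"),
    (1114, "5/2/16", 333333, "Orlando"),
    (1115, "5/5/16", 333333, "Brooklyn"),
    (1116, "5/7/16", 111111, "Orlando"),
]

# A frame of exactly one row (the previous or the next one) makes MIN()
# return that row's emp_num, or NULL when the frame is empty at the edges.
QUERY = """
SELECT id FROM (
  SELECT id, emp_num,
         MIN(emp_num) OVER (ORDER BY id
                            ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) AS prev_emp,
         MIN(emp_num) OVER (ORDER BY id
                            ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING) AS next_emp
  FROM tab
)
WHERE prev_emp = emp_num OR next_emp = emp_num
ORDER BY id
"""

def consecutive_ids(rows):
    """Return ids whose emp_num equals that of the previous or next row."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE tab (id INTEGER, dt TEXT, emp_num INTEGER, loc TEXT)")
    con.executemany("INSERT INTO tab VALUES (?, ?, ?, ?)", rows)
    ids = [row[0] for row in con.execute(QUERY)]
    con.close()
    return ids

print(consecutive_ids(ROWS))
```

On the sample data this should list 1113, 1114, and 1115. A comparison against NULL is never true, so the first and last rows need no special handling.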

Answer 2

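First, compute the difference between two row numbers: one ordered by id across the whole table, and one ordered by id within each emp_num partition. Within a run of consecutive rows that share an emp_num, this difference stays constant, so the pair (emp_num, difference) identifies each run. Then keep the runs that have more than one member (which means there are consecutive rows with the same emp_num value). Finally, select the required columns for those runs. The difference alone does not identify a run, because two different emp_num values can produce the same difference, so the query groups and joins on emp_num as well.

WITH x AS (
  SELECT
    *,
    -- global row number minus per-emp_num row number: constant within a run
    ROW_NUMBER() OVER (ORDER BY id)
      - ROW_NUMBER() OVER (PARTITION BY emp_num ORDER BY id) AS grp
  FROM t
),
grpsneeded AS (
  -- runs with more than one row
  SELECT
    emp_num,
    grp
  FROM x
  GROUP BY emp_num, grp
  HAVING COUNT(*) > 1
)
SELECT
  x.id,
  x.dt,   -- dt is the date column
  x.emp_num
FROM x
JOIN grpsneeded g
  ON g.emp_num = x.emp_num
 AND g.grp = x.grp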

This solution works well with SQL Server.
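
The row-number-difference idea can be checked on a small scale with the sketch below, which again uses SQLite through Python's sqlite3 module as a stand-in engine (an assumption; it is neither SQL Server nor Teradata). It keys each run on both emp_num and the difference, because the difference by itself can coincide for different emp_num values. The helper name run_ids is hypothetical, and ids are simply assigned 1, 2, 3, ... in insertion order.

```python
import sqlite3

QUERY = """
WITH x AS (
  SELECT id, emp_num,
         ROW_NUMBER() OVER (ORDER BY id)
           - ROW_NUMBER() OVER (PARTITION BY emp_num ORDER BY id) AS grp
  FROM t
),
grpsneeded AS (
  -- a run is identified by (emp_num, grp), not by grp alone
  SELECT emp_num, grp
  FROM x
  GROUP BY emp_num, grp
  HAVING COUNT(*) > 1
)
SELECT x.id
FROM x
JOIN grpsneeded g ON g.emp_num = x.emp_num AND g.grp = x.grp
ORDER BY x.id
"""

def run_ids(emp_nums):
    """Return the ids (1-based positions) that belong to a run of length > 1."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, emp_num INTEGER)")
    con.executemany("INSERT INTO t (emp_num) VALUES (?)", [(e,) for e in emp_nums])
    ids = [row[0] for row in con.execute(QUERY)]
    con.close()
    return ids

# emp_num sequence from the question: positions 3, 4, 5 form the run.
print(run_ids([111111, 222222, 333333, 333333, 333333, 111111]))
# Alternating values: no run at all, although rows 2 and 3 share the same difference.
print(run_ids([1, 2, 1, 2]))
```

In the alternating case, rows 2 and 3 both get a difference of 1 while belonging to different emp_num values, which is why grouping on the difference alone would report them as a run.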

A simpler SQL solution would use the LEAD and LAG functions. As @dnoeth pointed out, Teradata doesn't support these functions. However, this may be useful for other database engines.

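select id, dt, emp_num
from (
  select *
    ,lead(emp_num) over (order by id) nxt
    ,lag(emp_num) over (order by id) prev
  from t
) x
-- no coalesce needed: a comparison with NULL is never true, and
-- coalesce(..., 0) would wrongly match an edge row whose emp_num is 0
where nxt = emp_num or prev = emp_num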
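
For engines that do have LAG and LEAD, the sketch below exercises the comparison in SQLite through Python's sqlite3 module (SQLite is an assumption chosen for easy testing; the helper name lag_lead_ids is hypothetical). It compares prev and nxt to emp_num directly, without COALESCE, and includes an edge case with emp_num = 0 in the first row to show that NULL neighbours do not cause false matches.

```python
import sqlite3

QUERY = """
SELECT id FROM (
  SELECT id, emp_num,
         LAG(emp_num)  OVER (ORDER BY id) AS prev,
         LEAD(emp_num) OVER (ORDER BY id) AS nxt
  FROM t
)
WHERE prev = emp_num OR nxt = emp_num
ORDER BY id
"""

def lag_lead_ids(emp_nums):
    """Return ids (1-based positions) whose neighbour has the same emp_num."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, emp_num INTEGER)")
    con.executemany("INSERT INTO t (emp_num) VALUES (?)", [(e,) for e in emp_nums])
    ids = [row[0] for row in con.execute(QUERY)]
    con.close()
    return ids

# emp_num sequence from the question.
print(lag_lead_ids([111111, 222222, 333333, 333333, 333333, 111111]))
# First row has emp_num 0 and no previous row; it must not be reported.
print(lag_lead_ids([0, 5]))
```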

Answer 1
Score 2, accepted, 2016-09-01 14:26:05

Answer 2
Score 0, 2016-09-01 14:24:04
