How can I count the number of non-consecutive values in a column using SQL?

Question

Following up on my question here. Say I have a table in an Oracle database like the one below (table_1), which tracks service involvement for a particular individual:

name  day  srvc_inv
bill  1  1
bill  2  1
bill  3  0
bill  4  0
bill  5  1
bill  6  0
susy  1  1
susy  2  0
susy  3  1
susy  4  0
susy  5  1

My goal is to get a summary table that lists, for each unique individual, whether there was any service involvement and the number of distinct service episodes (in this case 2 for bill and 3 for susy), where a distinct service episode is delimited by a break in activity across days.

To get any service involvement, I would use the following query:

SELECT name, MAX(srvc_inv) AS any_invl
FROM table_1
GROUP BY name

However, I'm stuck on how to get the number of service episodes (2 for bill). With a static data frame in R, you would use run-length encoding (see my original question), but I don't know how to accomplish this in SQL. This operation would run over a large number of records, so it would be impractical to pull the entire data set into R as an object and process it there.
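For readers unfamiliar with run-length encoding, here is a minimal sketch of the idea in Python rather than R. It only illustrates the target numbers on the sample rows above; the helper name count_episodes is made up for this example, and the rows are assumed to be already sorted by name and day.

```python
from itertools import groupby

# Sample rows from the question: (name, day, srvc_inv), sorted by name, then day.
rows = [
    ("bill", 1, 1), ("bill", 2, 1), ("bill", 3, 0),
    ("bill", 4, 0), ("bill", 5, 1), ("bill", 6, 0),
    ("susy", 1, 1), ("susy", 2, 0), ("susy", 3, 1),
    ("susy", 4, 0), ("susy", 5, 1),
]


def count_episodes(flags):
    """Count runs of consecutive 1s in a sequence of 0/1 flags."""
    # groupby collapses consecutive equal values into one run,
    # which is the run-length encoding step; only runs of 1s are counted.
    return sum(1 for value, _run in groupby(flags) if value == 1)


# Group the rows per person, then count that person's runs of 1s.
episodes = {
    name: count_episodes([flag for _name, _day, flag in group])
    for name, group in groupby(rows, key=lambda row: row[0])
}
print(episodes)
```

The SQL answers below reach the same counts without materializing runs, by looking only at each row's neighbour.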

Edit: My expected output would be as follows:

name  any_invl  n_srvc_inv
bill  1  2
susy  1  3

Thanks for any help!

Answer 1

Something like this?

with test (name, day, srvc_inv) as
  (select 'bill', 1, 1 from dual union all
   select 'bill', 2, 1 from dual union all
   select 'bill', 3, 0 from dual union all
   select 'bill', 4, 0 from dual union all
   select 'bill', 5, 1 from dual union all
   select 'bill', 6, 0 from dual union all
   select 'susy', 1, 1 from dual union all
   select 'susy', 2, 0 from dual union all
   select 'susy', 3, 1 from dual union all
   select 'susy', 4, 0 from dual union all
   select 'susy', 5, 1 from dual
  ),
inter as
  -- lsrvc is the next day's srvc_inv for the same person (0 after the last day)
  (select name, day, srvc_inv,
     nvl(lead(srvc_inv) over (partition by name order by day), 0) lsrvc
   from test
  )
-- an episode ends wherever a 1 is followed by a 0 or by no further row
select name,
  sum(case when srvc_inv <> lsrvc and lsrvc = 0 then 1
           else 0
      end) grp
from inter
group by name;

NAME        GRP
---- ----------
bill          2
susy          3

Answer 2

I would suggest using lag(). The idea is to count a "1", but only when the preceding value is zero or null:

select name, count(*)
from (select t.*,
             lag(srvc_inv) over (partition by name order by day) as prev_srvc_inv
      from t
     ) t
where (prev_srvc_inv is null or prev_srvc_inv = 0) and
      srvc_inv = 1
group by name;

You can simplify this a little by using a default value for lag():

select name, count(*)
from (select t.*,
             lag(srvc_inv, 1, 0) over (partition by name order by day) as prev_srvc_inv
      from t
     ) t
where prev_srvc_inv = 0 and srvc_inv = 1
group by name;
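As a rough way to try the lag() idea without an Oracle instance, the sketch below runs a variant of the query against an in-memory SQLite database from Python. Several things here are assumptions, not part of the answer: SQLite stands in for Oracle (window functions need SQLite 3.25 or later), the derived table alias x is arbitrary, and the where filter is replaced by a conditional sum so that any_invl can be returned in the same pass.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the Oracle table t (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("create table t (name text, day integer, srvc_inv integer)")
conn.executemany(
    "insert into t values (?, ?, ?)",
    [
        ("bill", 1, 1), ("bill", 2, 1), ("bill", 3, 0),
        ("bill", 4, 0), ("bill", 5, 1), ("bill", 6, 0),
        ("susy", 1, 1), ("susy", 2, 0), ("susy", 3, 1),
        ("susy", 4, 0), ("susy", 5, 1),
    ],
)

# An episode starts on every row whose flag is 1 while the previous day's
# flag for the same person is 0 (lag defaults to 0 on the first day).
query = """
select name,
       max(srvc_inv) as any_invl,
       sum(case when prev_srvc_inv = 0 and srvc_inv = 1 then 1 else 0 end)
           as n_srvc_inv
from (select t.*,
             lag(srvc_inv, 1, 0) over (partition by name order by day)
                 as prev_srvc_inv
      from t
     ) x
group by name
order by name
"""
result = conn.execute(query).fetchall()
print(result)
```

Using a conditional sum instead of a where clause also keeps people with no service involvement in the output, with 0 in both columns, which the filtered version would drop.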

Answer 3

You can try the query below, which uses the LAG function to detect changes in srvc_inv:

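select name,
       max(srvc_inv) any_invl,  -- 1 if the person has any service involvement
       count(case when diff = 1 then 1 end) n_srvc_inv  -- each 0 -> 1 step starts an episode
from (select name, day, srvc_inv,
             -- partition by name so one person's last row never feeds the next person's first row
             srvc_inv - LAG(srvc_inv, 1, 0) OVER(PARTITION BY name ORDER BY day) diff
      from tab) temp
group by name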

Here is the fiddle for your reference.
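One detail worth checking with this approach is the window definition: a window ordered by name and day without a partition carries one person's last value into the next person's first row. The sketch below compares both window definitions on hypothetical rows (ann and bob are invented for this example), again using SQLite from Python as an assumed stand-in for Oracle.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table tab (name text, day integer, srvc_inv integer)")
# Hypothetical rows: ann ends on a 1 and bob starts on a 1.
conn.executemany(
    "insert into tab values (?, ?, ?)",
    [("ann", 1, 0), ("ann", 2, 1), ("bob", 1, 1), ("bob", 2, 0)],
)

# Same query shape as the answer; only the window clause differs.
template = """
select name, count(case when diff = 1 then 1 end) as n_srvc_inv
from (select name,
             srvc_inv - lag(srvc_inv, 1, 0) over ({window}) as diff
      from tab) temp
group by name
order by name
"""

# Without a partition, bob's first row is compared with ann's last row.
unpartitioned = conn.execute(
    template.format(window="order by name, day")
).fetchall()

# With a partition, each person's first row is compared with the default 0.
partitioned = conn.execute(
    template.format(window="partition by name order by day")
).fetchall()

print(unpartitioned)
print(partitioned)
```

On these rows the unpartitioned window misses bob's episode, while the partitioned one counts one episode for each person. The sample data in the question does not expose the difference because bill's last value is 0.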

3 answers

Answer 1
3 2019-08-31 19:08:25

Answer 2
2 accepted 2019-08-31 19:34:17

Answer 3
1 2019-08-31 19:21:42