Oracle: How to find overlaps in rows

Suppose I have the following table:

User_ID Activity_ID
123     222
123     333
124     222
124     224
124     333
125     224
125     333

I want to return a count of users for each combination of overlapping activities, such as the following:

Activity_ID_1 Activity_ID_2 Count_of_Users
222           333           2 
224           333           2

In the above example, there are 2 users who completed both 222 AND 333.

I do not want to define each combination manually, since I am working with 93 different activity_ids. Is there a way to do this purely in Oracle SQL?

Assuming you have an activity table with activity id's, and you want to count only DISTINCT users who had the same two activities (a user for whom the same pair of activities is recorded twice is still counted only once):

-- facts is joined to itself on user_id so that each row pairs two activities of the same user
select a1.activity_id, a2.activity_id, count(distinct f1.user_id)
from   activity a1 inner join facts    f1 on a1.activity_id = f1.activity_id
                   inner join facts    f2 on f2.user_id     = f1.user_id
                   inner join activity a2 on a2.activity_id = f2.activity_id
where  a1.activity_id < a2.activity_id
group by a1.activity_id, a2.activity_id
having count(distinct f1.user_id) >= 2
;

facts is the name of your facts table (the one you show in your question).

EDIT: If the facts table (or view or subquery or whatever) already has at most one row per user_id and activity_id, then delete "distinct" from my solution; this will make it more efficient. NOTE: "distinct" appears twice, once in SELECT and again in HAVING.
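For readers without an Oracle instance at hand, the pairing logic can be sanity-checked against the question's sample rows. The sketch below is an assumption-laden stand-in rather than the answer's exact query: it uses Python's built-in sqlite3 module with an in-memory SQLite database instead of Oracle, it leaves out the activity lookup table (the ids are read directly from facts), and the names `ROWS`, `PAIR_QUERY`, `overlap_counts`, and `min_users` are made up for illustration.

```python
import sqlite3

# Sample rows from the question: (user_id, activity_id).
ROWS = [(123, 222), (123, 333), (124, 222), (124, 224),
        (124, 333), (125, 224), (125, 333)]

# Self-join facts on user_id; f1.activity_id < f2.activity_id keeps
# each unordered pair of activities exactly once.
PAIR_QUERY = """
    SELECT f1.activity_id, f2.activity_id, COUNT(DISTINCT f1.user_id)
    FROM   facts f1
           JOIN facts f2
             ON  f2.user_id = f1.user_id
             AND f1.activity_id < f2.activity_id
    GROUP BY f1.activity_id, f2.activity_id
    HAVING COUNT(DISTINCT f1.user_id) >= ?
    ORDER BY f1.activity_id, f2.activity_id
"""


def overlap_counts(rows, min_users=2):
    """Return (activity_1, activity_2, user_count) for each qualifying pair."""
    conn = sqlite3.connect(":memory:")  # SQLite here, not Oracle (assumption)
    try:
        conn.execute("CREATE TABLE facts (user_id INTEGER, activity_id INTEGER)")
        conn.executemany("INSERT INTO facts VALUES (?, ?)", rows)
        return conn.execute(PAIR_QUERY, (min_users,)).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in overlap_counts(ROWS):
        print(row)
```

With the default threshold of 2 users this yields the pairs (222, 333) and (224, 333), each with a count of 2. Passing `min_users=1` also lists (222, 224), which is shared only by user 124.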

Oracle Setup:

CREATE TABLE data ( User_ID, Activity_ID ) AS
SELECT 123, 222 FROM DUAL UNION ALL
SELECT 123, 333 FROM DUAL UNION ALL
SELECT 124, 222 FROM DUAL UNION ALL
SELECT 124, 224 FROM DUAL UNION ALL
SELECT 124, 333 FROM DUAL UNION ALL
SELECT 125, 224 FROM DUAL UNION ALL
SELECT 125, 333 FROM DUAL;

CREATE TYPE INTLIST AS TABLE OF INT;
/

Query:

WITH Activities ( User_IDs, Activity_ID ) AS (
  SELECT CAST( COLLECT( User_ID ) AS INTLIST ),
         Activity_ID
  FROM   data
  GROUP BY Activity_ID
)
SELECT a.Activity_ID,
       b.Activity_ID,
       CARDINALITY( a.User_IDs MULTISET INTERSECT b.User_IDs ) AS "Count"
FROM   Activities a
       INNER JOIN
       Activities b
       ON ( CARDINALITY( a.User_IDs MULTISET INTERSECT b.User_IDs ) > 1
           AND a.Activity_ID < b.Activity_ID );

Output:

ACTIVITY_ID ACTIVITY_ID      Count
----------- ----------- ----------
        222         333          2 
        224         333          2 
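The collection-based query groups the user ids per activity and then intersects those groups pairwise. As a rough cross-check outside the database, the same idea can be sketched in Python with ordinary sets. This is an illustration only: `pairwise_overlaps` and `min_users` are hypothetical names, and a set drops duplicate user ids, whereas an Oracle multiset would keep them, so the two agree only when each (user, activity) row appears once, as in the sample data.

```python
from collections import defaultdict
from itertools import combinations

# Sample rows from the question: (user_id, activity_id).
ROWS = [(123, 222), (123, 333), (124, 222), (124, 224),
        (124, 333), (125, 224), (125, 333)]


def pairwise_overlaps(rows, min_users=2):
    """Return (activity_a, activity_b, shared_user_count) with a < b."""
    # Rough analogue of COLLECT ... GROUP BY Activity_ID: one set per activity.
    users_by_activity = defaultdict(set)
    for user_id, activity_id in rows:
        users_by_activity[activity_id].add(user_id)

    result = []
    # combinations() over the sorted ids visits each pair once, with a < b.
    for a, b in combinations(sorted(users_by_activity), 2):
        shared = users_by_activity[a] & users_by_activity[b]  # set intersection
        if len(shared) >= min_users:
            result.append((a, b, len(shared)))
    return result


if __name__ == "__main__":
    for row in pairwise_overlaps(ROWS):
        print(row)
```

The default `min_users=2` corresponds to the `> 1` condition in the join above. With 93 activity ids there are 93 * 92 / 2 = 4278 pairs to compare, which is small enough to enumerate this way.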
