
Count and rank-order co-occurrences of entities across dates and locations in MySQL

I need to build a SQL script that counts how many times pairs of different entities appeared on the same DATE at the same LOCATION. On any given date, there will be multiple locations and many entity IDs. I need to find out how often PAIRS of entities were at the same location on the same date, and count the number of co-occurrences. In reality, I'm going to have many hundreds of distinct entities across 12 months of dates and 20+ locations.

Entity  Date    Location
A       1-1-23  Loc 1
B       1-1-23  Loc 1
C       1-1-23  Loc 1
D       1-1-23  Loc 1
E       1-1-23  Loc 1
F       1-1-23  Loc 1
A       1-2-23  Loc 2
B       1-2-23  Loc 2
D       1-2-23  Loc 2
C       1-2-23  Loc 3
F       1-2-23  Loc 3
B       1-3-23  Loc 2
A       1-4-23  Loc 1
F       1-4-23  Loc 1
A       1-5-23  Loc 2
C       1-5-23  Loc 2
D       1-5-23  Loc 2
E       1-5-23  Loc 3

I want to count how many times entity A appeared with entity B on the same date and location. The results would look like this (note: eventually I'll order by count descending, but this ordering lets you see all the pairwise combinations first):

Entity1  Entity2  Count
A        B        2
A        C        2
A        D        3
A        E        1
A        F        2
B        C        1
B        D        2
B        E        1
B        F        1
C        D        2
C        E        1
C        F        2
D        E        1
D        F        1
E        F        1

I'm at a bit of a loss on how to do this. My first thought was:

SELECT t1.Entity as Entity1, t2.Entity as Entity2, COUNT(*) as Count
FROM (
SELECT Entity, CONCAT(Date, Location) AS ConcatenatedValue, COUNT(*) 
FROM occurrences 
WHERE Year(Date) = 2022) t1,
(SELECT Entity, CONCAT(Date, Location) AS ConcatenatedValue, COUNT(*)
FROM occurrences
WHERE Year(Date) = 2022) t2
WHERE t1.ConcatenatedValue = t2.ConcatenatedValue
GROUP BY Entity1, Entity2
ORDER BY Count

Clearly that doesn't do what I need. HELP. My head is spinning.

You can address this problem with a self join on your table, with the following conditions:

  • the date must match
  • the location must match
  • the 1st table's entity is smaller than the 2nd table's entity
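To see what the third condition buys you, here is a minimal sketch that compares three variants of the join on a tiny subset of the sample data (three entities at one date and location). It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table name `tab` and the helper `pairs` are assumptions made for illustration only.

```python
import sqlite3

# Three entities sharing one date and location (subset of the question's data).
rows = [("A", "1-1-23", "Loc 1"), ("B", "1-1-23", "Loc 1"), ("C", "1-1-23", "Loc 1")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (Entity TEXT, Date TEXT, Location TEXT)")
conn.executemany("INSERT INTO tab VALUES (?, ?, ?)", rows)

def pairs(extra_condition):
    """Self join on date and location, plus an optional extra join condition."""
    sql = f"""
        SELECT t1.Entity, t2.Entity
        FROM tab t1
        INNER JOIN tab t2
                ON t1.Date = t2.Date
               AND t1.Location = t2.Location
               {extra_condition}
        ORDER BY t1.Entity, t2.Entity
    """
    return conn.execute(sql).fetchall()

no_filter = pairs("")                            # includes (A, A) and both (A, B) and (B, A)
not_equal = pairs("AND t1.Entity <> t2.Entity")  # drops self pairs, still mirrored
less_than = pairs("AND t1.Entity < t2.Entity")   # each unordered pair exactly once

print(len(no_filter), len(not_equal), len(less_than))
print(less_than)
```

With no extra condition every entity is also paired with itself, and with `<>` each pair still shows up twice in mirrored order. Only `<` leaves one row per unordered pair, which is what the counts in the question expect.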

Then you can apply aggregation directly.

SELECT t1.Entity      AS entity1,
       t2.Entity      AS entity2,
       COUNT(t1.Date) AS cnt
FROM       tab t1
INNER JOIN tab t2
        ON t1.Date = t2.Date 
       AND t1.Location = t2.Location 
       AND t1.Entity < t2.Entity
GROUP BY entity1, entity2
ORDER BY entity1, entity2

Check the demo here.
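As a quick check of the self-join approach against the sample data from the question, the sketch below loads those rows into an in-memory SQLite database and runs the same join, this time ordered by count descending as the question asks for in the end. SQLite via Python's sqlite3 module is only a stand-in for MySQL here, and the table name `tab` and the `groups` dictionary are assumptions made for illustration.

```python
import sqlite3

# Sample data from the question, grouped by (date, location) for brevity.
groups = {
    ("1-1-23", "Loc 1"): "ABCDEF",
    ("1-2-23", "Loc 2"): "ABD",
    ("1-2-23", "Loc 3"): "CF",
    ("1-3-23", "Loc 2"): "B",
    ("1-4-23", "Loc 1"): "AF",
    ("1-5-23", "Loc 2"): "ACD",
    ("1-5-23", "Loc 3"): "E",
}
rows = [(e, d, loc) for (d, loc), entities in groups.items() for e in entities]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (Entity TEXT, Date TEXT, Location TEXT)")
conn.executemany("INSERT INTO tab VALUES (?, ?, ?)", rows)

# Same self join as in the answer; only the ORDER BY differs.
result = conn.execute("""
    SELECT t1.Entity AS entity1,
           t2.Entity AS entity2,
           COUNT(*)  AS cnt
    FROM       tab t1
    INNER JOIN tab t2
            ON t1.Date = t2.Date
           AND t1.Location = t2.Location
           AND t1.Entity < t2.Entity
    GROUP BY t1.Entity, t2.Entity
    ORDER BY cnt DESC, entity1, entity2
""").fetchall()

for entity1, entity2, cnt in result:
    print(entity1, entity2, cnt)
```

The ties in `cnt` are broken by `entity1, entity2` so the output order is deterministic. In MySQL the same `ORDER BY cnt DESC, entity1, entity2` clause can replace the last line of the query above.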
