PostgreSQL - select count(*) for rows where a condition holds

I have the following table with some sample records:

  id  | attr1_id | attr2_id |      user_id      | rating_id | override_comment
------+----------+----------+-------------------+-----------+------------------
 1    |      188 |      201 | user_1@domain.com |         3 |
 2    |      193 |      201 | user_2@domain.com |         2 |
 3    |      193 |      201 | user_2@domain.com |         1 |
 4    |      194 |      201 | user_2@domain.com |         1 |
 5    |      194 |      201 | user_1@domain.com |         1 |
 6    |      192 |      201 | user_2@domain.com |         1 |

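The combination (attr1_id, attr2_id, user_id) is UNIQUE, meaning each user can create only one record for a specific pair of attribute ids.

My goal is to count the number of rows where rating_id = 1, but count each combination of attr1_id and attr2_id only once, and only when no other row (by another user) has rating_id > 1 and refers to the same attr1_id and attr2_id. Note that the combination of attr1_id and attr2_id can be switched around, so given these two records: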

  id  | attr1_id | attr2_id |      user_id       | rating_id | override_comment
------+----------+----------+--------------------+-----------+------------------
  20  |       5  |       2  | user_1@domain.com  |         3 |
------+----------+----------+--------------------+-----------+------------------
  21  |       2  |       5  | user_2@domain.com  |         1 |

no row should be counted, because the rows refer to the same combination of attr ids and one of them has rating_id > 1.

However, if these rows exist:

  id  | attr1_id | attr2_id |      user_id       | rating_id | override_comment
------+----------+----------+--------------------+-----------+------------------
  20  |       5  |       2  | user_1@domain.com  |         1 |
------+----------+----------+--------------------+-----------+------------------
  21  |       2  |       5  | user_2@domain.com  |         1 |
------+----------+----------+--------------------+-----------+------------------
  22  |       2  |       5  | user_3@domain.com  |         1 |

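all rows should be counted as just 1, because they all share the same combination of attr1_id and attr2_id and all have rating_id = 1.

My approach so far is this, but it selects no rows at all.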

SELECT *
FROM compatibility c
WHERE rating_id > 1
  AND NOT EXISTs
    (SELECT *
     FROM compatibility c2
     WHERE c.rating_id > 1
       AND (
             (c.attr1_id = c2.attr1_id) AND (c.attr2_id = c2.attr2_id)
             OR
             (c.attr1_id = c2.attr2_id) AND (c.attr2_id = c2.attr1_id)
           )
    )

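How can I achieve this?

"My goal is to count the number of rows where rating_id = 1, but count each combination of attr1_id and attr2_id only once, and only when no other row (by another user) has rating_id > 1"

Building on your original

Your original query was on the right track to exclude offending rows. You just had > instead of =. And the tricky step of counting was missing.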

SELECT count(*) AS ct
FROM  (
   SELECT 1
   FROM   compatibility c
   WHERE  rating_id = 1
   AND    NOT EXISTS (
      SELECT 1
      FROM   compatibility c2
      WHERE  c2.rating_id > 1
      AND   (c2.attr1_id = c.attr1_id AND c2.attr2_id = c.attr2_id OR
             c2.attr1_id = c.attr2_id AND c2.attr2_id = c.attr1_id))
   GROUP  BY least(attr1_id, attr2_id), greatest(attr1_id, attr2_id)
   ) sub;
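To see why the question's query returns nothing while the corrected filter keeps rows, here is a minimal sketch that replays both against the sample data. It uses Python's sqlite3 module as a stand-in for PostgreSQL (an assumption made purely so the snippet is self-contained); the helper `run` and the constants are hypothetical names. In the original, the inner condition tests `c.rating_id` (the outer row) instead of `c2.rating_id`, so every outer row matches itself and NOT EXISTS is always false.

```python
import sqlite3

# Sample rows from the question: (id, attr1_id, attr2_id, user_id, rating_id).
# override_comment is omitted, and no UNIQUE constraint is declared because
# sample rows 2 and 3 share the same (attr1_id, attr2_id, user_id).
ROWS = [
    (1, 188, 201, "user_1@domain.com", 3),
    (2, 193, 201, "user_2@domain.com", 2),
    (3, 193, 201, "user_2@domain.com", 1),
    (4, 194, 201, "user_2@domain.com", 1),
    (5, 194, 201, "user_1@domain.com", 1),
    (6, 192, 201, "user_2@domain.com", 1),
]

# Same logic as the query in the question.
ORIGINAL = """
SELECT * FROM compatibility c
WHERE rating_id > 1
  AND NOT EXISTS (
     SELECT * FROM compatibility c2
     WHERE c.rating_id > 1
       AND ((c.attr1_id = c2.attr1_id AND c.attr2_id = c2.attr2_id)
         OR (c.attr1_id = c2.attr2_id AND c.attr2_id = c2.attr1_id)))
"""

# Inner part of the corrected query, before grouping and counting.
FIXED = """
SELECT c.id FROM compatibility c
WHERE c.rating_id = 1
  AND NOT EXISTS (
     SELECT 1 FROM compatibility c2
     WHERE c2.rating_id > 1
       AND ((c2.attr1_id = c.attr1_id AND c2.attr2_id = c.attr2_id)
         OR (c2.attr1_id = c.attr2_id AND c2.attr2_id = c.attr1_id)))
ORDER BY c.id
"""

def run(sql, rows=ROWS):
    """Load rows into a fresh in-memory table and return the query result."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE compatibility (id INTEGER, attr1_id INTEGER, "
        "attr2_id INTEGER, user_id TEXT, rating_id INTEGER)"
    )
    con.executemany("INSERT INTO compatibility VALUES (?, ?, ?, ?, ?)", rows)
    result = con.execute(sql).fetchall()
    con.close()
    return result
```

Calling `run(ORIGINAL)` yields an empty list, while `run(FIXED)` keeps ids 4, 5 and 6, which collapse to two attribute pairs once grouped.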

A shorter variant, probably faster, too:

SELECT count(*) AS ct
FROM  (
   SELECT 1  -- selecting more columns for count only would be a waste
   FROM   compatibility
   GROUP  BY least(attr1_id, attr2_id), greatest(attr1_id, attr2_id)
   HAVING every(rating_id = 1)
   ) sub;

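Similar to @Clodoaldo's query or an earlier answer with more explanation.
every(rating_id = 1) is simpler than not bool_or(rating_id > 1), but it also excludes rating < 1, which is probably fine (or even better) in your case.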
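The difference between the two HAVING conditions only shows up when a rating below 1 exists. The sketch below illustrates that with invented rows, again using Python's sqlite3 module as a stand-in. SQLite has neither every() nor bool_or(), so both are emulated here with min() and max() over 0/1 comparison results; that emulation, the rating of 0 and all names are assumptions for illustration only.

```python
import sqlite3

# Hypothetical rows (attr1_id, attr2_id, rating_id); the rating of 0 is invented.
ROWS = [(5, 2, 1), (2, 5, 0), (7, 8, 1)]

# Stand-ins for the Postgres aggregates (comparisons yield 0 or 1 in SQLite).
EVERY_IS_ONE = "min(rating_id = 1) = 1"   # emulates every(rating_id = 1)
NONE_ABOVE_ONE = "max(rating_id > 1) = 0"  # emulates not bool_or(rating_id > 1)

def count_pairs(having, rows=ROWS):
    """Count unordered attribute pairs whose group satisfies the HAVING text."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE compatibility "
        "(attr1_id INTEGER, attr2_id INTEGER, rating_id INTEGER)"
    )
    con.executemany("INSERT INTO compatibility VALUES (?, ?, ?)", rows)
    # Two-argument min()/max() are scalar in SQLite and play the role of
    # least()/greatest() from the Postgres queries.
    sql = f"""
        SELECT count(*) FROM (
           SELECT 1 FROM compatibility
           GROUP BY min(attr1_id, attr2_id), max(attr1_id, attr2_id)
           HAVING {having}
        ) sub
    """
    (ct,) = con.execute(sql).fetchone()
    con.close()
    return ct
```

With these rows, the every-style condition counts only the (7, 8) pair, whereas the bool_or-style condition also counts the (2, 5) pair that contains the rating of 0.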

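MySQL does not currently implement every() (standard SQL!). Since you only want to eliminate rating_id > 1, this simple expression fits your requirement more closely and works in both RDBMS: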

HAVING max(rating_id) = 1
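As a quick check of the max(rating_id) = 1 condition, the following sketch runs the grouped count against the question's sample data and its two swapped-pair scenarios. Python's sqlite3 module stands in for PostgreSQL here, with SQLite's two-argument min()/max() replacing least()/greatest(); the function and constant names are hypothetical.

```python
import sqlite3

# (attr1_id, attr2_id, rating_id) taken from the question's tables.
SAMPLE = [(188, 201, 3), (193, 201, 2), (193, 201, 1),
          (194, 201, 1), (194, 201, 1), (192, 201, 1)]
MIXED = [(5, 2, 3), (2, 5, 1)]               # one row with rating_id > 1
ALL_ONES = [(5, 2, 1), (2, 5, 1), (2, 5, 1)]  # swapped pair, all ratings 1

def count_clean_pairs(rows):
    """Count unordered attribute pairs whose highest rating_id is 1."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE compatibility "
        "(attr1_id INTEGER, attr2_id INTEGER, rating_id INTEGER)"
    )
    con.executemany("INSERT INTO compatibility VALUES (?, ?, ?)", rows)
    (ct,) = con.execute("""
        SELECT count(*) FROM (
           SELECT 1 FROM compatibility
           GROUP BY min(attr1_id, attr2_id), max(attr1_id, attr2_id)
           HAVING max(rating_id) = 1   -- single-argument max() is the aggregate
        ) sub
    """).fetchone()
    con.close()
    return ct
```

On the sample table this counts the pairs (194, 201) and (192, 201); the mixed scenario counts nothing, and the all-ones scenario counts once.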

Shortest

Using count(*) as a window aggregate function, and no subquery.

SELECT count(*) OVER () AS ct
FROM   compatibility
GROUP  BY least(attr1_id, attr2_id), greatest(attr1_id, attr2_id)
HAVING max(rating_id) = 1
LIMIT  1;

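Window functions are applied after the aggregate step. Building on that, we get two aggregate steps in a single query level:

  1. Fold equivalent (attr1_id, attr2_id) pairs, excluding those where rows with a different rating_id exist.
  2. Count the remaining rows with a window function.

LIMIT 1 to get a single row (all rows would be identical).
MySQL had no window functions at the time of writing (they were added in MySQL 8.0), so this variant was Postgres-only.
Shortest, not necessarily fastest.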
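One behaviour of the window variant is worth seeing in action: when no pair qualifies, it returns no row at all rather than a row containing 0. The sketch below shows this with Python's sqlite3 module as a stand-in for PostgreSQL. It assumes the bundled SQLite is version 3.25 or later (needed for window functions) and uses two-argument min()/max() in place of least()/greatest(); the names are hypothetical.

```python
import sqlite3

# (attr1_id, attr2_id, rating_id) taken from the question's tables.
SAMPLE = [(188, 201, 3), (193, 201, 2), (193, 201, 1),
          (194, 201, 1), (194, 201, 1), (192, 201, 1)]
MIXED = [(5, 2, 3), (2, 5, 1)]  # the only pair has a rating_id > 1

def window_count(rows):
    """Run the single-level window query and return all result rows."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE compatibility "
        "(attr1_id INTEGER, attr2_id INTEGER, rating_id INTEGER)"
    )
    con.executemany("INSERT INTO compatibility VALUES (?, ?, ?)", rows)
    result = con.execute("""
        SELECT count(*) OVER () AS ct      -- counts the groups that survive HAVING
        FROM compatibility
        GROUP BY min(attr1_id, attr2_id), max(attr1_id, attr2_id)
        HAVING max(rating_id) = 1
        LIMIT 1
    """).fetchall()
    con.close()
    return result
```

For the sample table the result is a single row holding 2. For the mixed scenario the result is empty, so a caller that needs a 0 would have to handle the missing row, which the subquery versions do not require.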

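SQL Fiddle. (On pg 9.2, since pg 9.3 was offline at the time.)

If I understand correctly, you want the attribute pairs whose rating is always "1".

This should give you the attributes: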

select least(attr1_id, attr2_id) as a1, greatest(attr1_id, attr2_id) as a2,
       min(rating_id) as minri, max(rating_id) as maxri
from compatibility c
group by least(attr1_id, attr2_id), greatest(attr1_id, attr2_id)
having min(rating_id) = 1 and max(rating_id) = 1;

To get the count, just use this as a subquery:

select count(*)
from (select least(attr1_id, attr2_id) as a1, greatest(attr1_id, attr2_id) as a2,
             min(rating_id) as minri, max(rating_id) as maxri
      from compatibility c
      group by least(attr1_id, attr2_id), greatest(attr1_id, attr2_id)
      having min(rating_id) = 1 and max(rating_id) = 1
     ) c

Did this in PostgreSQL (SQL Fiddle is not working right now either):

select count(*)
from (
    select least(attr1_id, attr2_id), greatest(attr1_id, attr2_id)
    from compatibility
    group by 1, 2
    having not bool_or(rating_id > 1)
) s
;
 count 
-------
     2
(1 row)

I would use CASE .. WHEN to rearrange the attributes so that the smaller one always comes first and the order is always the same. Example query below.

SELECT attrSmall, 
       attrLarge,            
       MAX(rating_id) as ratingMax
  FROM (
   SELECT CASE WHEN c.attr1_id < c.attr2_id 
               THEN c.attr1_id 
               ELSE c.attr2_id END as attrSmall,
          CASE WHEN c.attr1_id < c.attr2_id 
               THEN c.attr2_id 
               ELSE c.attr1_id END as attrLarge,
          c.rating_id
    FROM compatibility c) as c1
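  GROUP BY attrSmall, attrLarge
  -- PostgreSQL does not accept the output alias ratingMax in HAVING,
  -- so the aggregate is repeated here.
  HAVING MAX(rating_id) = 1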
