PostgreSQL - select count(*) for rows where a condition holds

Question

I have the following table and some sample records:

  id  | attr1_id | attr2_id |      user_id      | rating_id | override_comment
------+----------+----------+-------------------+-----------+------------------
 1    |      188 |      201 | user_1@domain.com |         3 |
 2    |      193 |      201 | user_2@domain.com |         2 |
 3    |      193 |      201 | user_2@domain.com |         1 |
 4    |      194 |      201 | user_2@domain.com |         1 |
 5    |      194 |      201 | user_1@domain.com |         1 |
 6    |      192 |      201 | user_2@domain.com |         1 |

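The combination (attr1_id, attr2_id, user_id) is UNIQUE, meaning each user can create only one record for a given pair of attribute ids.

My goal is to count the rows with rating_id = 1, but to count each combination of attr1_id and attr2_id only once, and only if no other row (by another user) has rating_id > 1 and refers to the same attr1_id and attr2_id. Note that the combination of attr1_id and attr2_id can be switched, so given these two records: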

  id  | attr1_id | attr2_id |      user_id       | rating_id | override_comment
------+----------+----------+--------------------+-----------+------------------
  20  |       5  |       2  | user_1@domain.com  |         3 |
  21  |       2  |       5  | user_2@domain.com  |         1 |

no row should be counted, because both rows refer to the same combination of attr ids and one of them has rating_id > 1.

However, if these rows exist:

  id  | attr1_id | attr2_id |      user_id       | rating_id | override_comment
------+----------+----------+--------------------+-----------+------------------
  20  |       5  |       2  | user_1@domain.com  |         1 |
  21  |       2  |       5  | user_2@domain.com  |         1 |
  22  |       2  |       5  | user_3@domain.com  |         1 |

all of them together should count as 1, because they share the same combination of attr1_id and attr2_id and every row has rating_id = 1.

My approach so far is the following, but it selects no rows at all.

SELECT *
FROM compatibility c
WHERE rating_id > 1
  AND NOT EXISTs
    (SELECT *
     FROM compatibility c2
     WHERE c.rating_id > 1
       AND (
             (c.attr1_id = c2.attr1_id) AND (c.attr2_id = c2.attr2_id)
             OR
             (c.attr1_id = c2.attr2_id) AND (c.attr2_id = c2.attr1_id)
           )
    )

How can I achieve this?

Answer 1

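My goal is to count the rows with rating_id = 1, but to count each combination of attr1_id and attr2_id only once, and only if no other row (by another user) has rating_id > 1

Building on your original

Your original query was on the right track for excluding the offending rows. You just had > where you needed =, and the tricky counting step was missing.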

SELECT count(*) AS ct
FROM  (
   SELECT 1
   FROM   compatibility c
   WHERE  rating_id = 1
   AND    NOT EXISTS (
      SELECT 1
      FROM   compatibility c2
      WHERE  c2.rating_id > 1
      AND   (c2.attr1_id = c.attr1_id AND c2.attr2_id = c.attr2_id OR
             c2.attr1_id = c.attr2_id AND c2.attr2_id = c.attr1_id))
   GROUP  BY least(attr1_id, attr2_id), greatest(attr1_id, attr2_id)
   ) sub;
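
To check the logic of this query against the question's data, here is a minimal sketch that runs it in an in-memory SQLite database from Python. SQLite is only a stand-in for PostgreSQL: it has no least() / greatest(), so the sketch assumes SQLite's two-argument scalar min() / max() as substitutes. The helper name count_clean_pairs and the reduced table definition are made up for illustration.

```python
import sqlite3

# Rows from the question as (attr1_id, attr2_id, user_id, rating_id),
# plus the swapped pair (5, 2) / (2, 5) from its second table.
ROWS = [
    (188, 201, "user_1@domain.com", 3),
    (193, 201, "user_2@domain.com", 2),
    (193, 201, "user_2@domain.com", 1),
    (194, 201, "user_2@domain.com", 1),
    (194, 201, "user_1@domain.com", 1),
    (192, 201, "user_2@domain.com", 1),
    (5, 2, "user_1@domain.com", 3),
    (2, 5, "user_2@domain.com", 1),
]

# Same shape as the PostgreSQL query; min(a, b) / max(a, b) stand in
# for least(a, b) / greatest(a, b), which SQLite does not provide.
QUERY = """
SELECT count(*) AS ct
FROM  (
   SELECT 1
   FROM   compatibility c
   WHERE  c.rating_id = 1
   AND    NOT EXISTS (
      SELECT 1
      FROM   compatibility c2
      WHERE  c2.rating_id > 1
      AND   ((c2.attr1_id = c.attr1_id AND c2.attr2_id = c.attr2_id) OR
             (c2.attr1_id = c.attr2_id AND c2.attr2_id = c.attr1_id)))
   GROUP  BY min(c.attr1_id, c.attr2_id), max(c.attr1_id, c.attr2_id)
   ) AS sub
"""


def count_clean_pairs(rows):
    """Count attribute pairs rated 1 that no row rates above 1."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE compatibility ("
        "attr1_id INTEGER, attr2_id INTEGER, user_id TEXT, rating_id INTEGER)"
    )
    con.executemany("INSERT INTO compatibility VALUES (?, ?, ?, ?)", rows)
    (ct,) = con.execute(QUERY).fetchone()
    con.close()
    return ct


if __name__ == "__main__":
    print(count_clean_pairs(ROWS))
```

Only the pairs (194, 201) and (192, 201) survive: (193, 201) and the swapped (5, 2) / (2, 5) pair each have a row with rating_id > 1.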

Shorter

Probably faster, too.

SELECT count(*) AS ct
FROM  (
   SELECT 1  -- selecting more columns for count only would be a waste
   FROM   compatibility
   GROUP  BY least(attr1_id, attr2_id), greatest(attr1_id, attr2_id)
   HAVING every(rating_id = 1)
   ) sub;

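Similar to @Clodoaldo's query, or to this earlier answer, which has more explanation.
every(rating_id = 1) is simpler than not bool_or(rating_id > 1), but it also excludes ratings < 1, which is probably fine (or even better) for your case.

MySQL does not currently implement every(), even though it is standard SQL. Since you only want to eliminate rating_id > 1, this simple expression matches your request more closely and works in both RDBMS: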

HAVING max(rating_id) = 1
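
The three HAVING conditions agree on the question's data but can diverge once a group contains a rating below 1. The following plain-Python sketch mimics each condition on a list of ratings per group. The groups with a rating of 0 are hypothetical (the question shows no such values), NULL ratings are ignored, and the function names are invented for illustration.

```python
def every_is_one(ratings):
    """Mimics HAVING every(rating_id = 1)."""
    return all(r == 1 for r in ratings)


def no_rating_above_one(ratings):
    """Mimics HAVING NOT bool_or(rating_id > 1)."""
    return not any(r > 1 for r in ratings)


def max_is_one(ratings):
    """Mimics HAVING max(rating_id) = 1."""
    return max(ratings) == 1


# Hypothetical groups of rating_id values, one list per attribute pair.
GROUPS = {
    "all ones": [1, 1],
    "one and two": [1, 2],
    "zero and one": [0, 1],
    "only zero": [0],
}

if __name__ == "__main__":
    for name, ratings in GROUPS.items():
        print(
            name,
            every_is_one(ratings),
            no_rating_above_one(ratings),
            max_is_one(ratings),
        )
```

If ratings never drop below 1, all three conditions keep exactly the same groups, so the choice comes down to portability and readability.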

Shortest

Use count(*) as a window aggregate function, with no subquery.

SELECT count(*) OVER () AS ct
FROM   compatibility
GROUP  BY least(attr1_id, attr2_id), greatest(attr1_id, attr2_id)
HAVING max(rating_id) = 1
LIMIT  1;

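Window functions are applied after the aggregate step. Building on that, we get two aggregation steps in a single query level:

Fold equivalent (attr1_id, attr2_id) pairs, excluding those that contain a row with a different rating_id.
Count all remaining rows with the window function.

LIMIT 1 returns a single row (all rows would be identical).
MySQL had no window functions at the time of writing (they arrived in MySQL 8.0), so this was Postgres only.
Shortest, not necessarily fastest.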
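
One behaviour of the window variant is worth checking: when no group passes the HAVING filter, there is no row left for the window function to run on, so the query returns no row at all rather than a row containing 0. The sketch below shows this with an in-memory SQLite database, assuming a SQLite build with window functions (3.25 or later) and using the two-argument min() / max() in place of least() / greatest(). The helper name window_count and the reduced three-column table are illustrative assumptions.

```python
import sqlite3

# Window-function variant; min(a, b) / max(a, b) replace least / greatest.
QUERY = """
SELECT count(*) OVER () AS ct
FROM   compatibility
GROUP  BY min(attr1_id, attr2_id), max(attr1_id, attr2_id)
HAVING max(rating_id) = 1
LIMIT  1
"""


def window_count(rows):
    """Return the count, or None when the query yields no row."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE compatibility ("
        "attr1_id INTEGER, attr2_id INTEGER, rating_id INTEGER)"
    )
    con.executemany("INSERT INTO compatibility VALUES (?, ?, ?)", rows)
    row = con.execute(QUERY).fetchone()
    con.close()
    return None if row is None else row[0]


if __name__ == "__main__":
    # Three rows for the same (swapped) pair, all rated 1.
    print(window_count([(5, 2, 1), (2, 5, 1), (2, 5, 1)]))
    # One row of the pair is rated 3, so no group qualifies.
    print(window_count([(5, 2, 3), (2, 5, 1)]))
```

Callers that need a 0 in the empty case may prefer one of the subquery variants, where count(*) always returns exactly one row.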

SQL Fiddle. (On pg 9.2, since pg 9.3 is currently offline.)

Answer 2

If I understand correctly, you want the attribute pairs whose rating is always "1".

This should give you those pairs:

select least(attr1_id, attr2_id) as a1, greatest(attr1_id, attr2_id) as a2,
       min(rating_id) as minri, max(rating_id) as maxri
from compatibility c
group by least(attr1_id, attr2_id), greatest(attr1_id, attr2_id)
having min(rating_id) = 1 and max(rating_id) = 1;

To get the count, just use it as a subquery:

select count(*)
from (select least(attr1_id, attr2_id) as a1, greatest(attr1_id, attr2_id) as a2,
             min(rating_id) as minri, max(rating_id) as maxri
      from compatibility c
      group by least(attr1_id, attr2_id), greatest(attr1_id, attr2_id)
      having min(rating_id) = 1 and max(rating_id) = 1
     ) c

Answer 3

Did this in PostgreSQL directly, since SQL Fiddle is not working at the moment:

select count(*)
from (
    select least(attr1_id, attr2_id), greatest(attr1_id, attr2_id)
    from compatibility
    group by 1, 2
    having not bool_or(rating_id > 1)
) s
;
 count 
-------
     2
(1 row)

Answer 4

I would use CASE .. WHEN to rearrange the attributes so that the smaller one always comes first and the order is always the same. Example query below.

SELECT attrSmall,
       attrLarge,
       MAX(rating_id) AS ratingMax
  FROM (
   -- Put the smaller attribute id first so swapped pairs fall into one group
   SELECT CASE WHEN c.attr1_id < c.attr2_id
               THEN c.attr1_id
               ELSE c.attr2_id END AS attrSmall,
          CASE WHEN c.attr1_id < c.attr2_id
               THEN c.attr2_id
               ELSE c.attr1_id END AS attrLarge,
          c.rating_id
    FROM compatibility c) AS c1
  GROUP BY attrSmall, attrLarge
  -- PostgreSQL does not accept output aliases in HAVING, so repeat the aggregate
  HAVING MAX(rating_id) = 1;
Answer 1 | 2 | 2014-11-03 23:38:11
Answer 2 | 1 | accepted | 2014-11-03 17:11:40
Answer 3 | 1 | 2014-11-03 18:33:27
Answer 4 | 0 | 2014-11-03 17:11:09
