Version 2: How can I write a postgres sql query that returns M distinct values of a certain column with M + unknown overall records returned?

This is a slightly harder version of what I posted earlier. I don't want to edit the original and ruin a perfectly good answer to the question I had at the time.

Suppose there is a table named tau.

tau:
A  | B  | C
---+----+------
2  | 1  | red
3  | 1  | rod
4  | 1  | rope
6  | 5  | red
7  | 5  | rap
8  | 5  | rod
9  | 10 | rod
11 | 10 | road
12 | 13 | rud

Column A is the primary key, so it is unique. Column B is a foreign key. In my database, no integer key is ever the same across tables. Column C is not a key.

This table will have a lot of rows. Many other columns, column C for instance, are indexed for search.
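To reproduce the example, the table could be created along these lines. The column types (`integer`, `text`) and the index names `tau_b_idx` and `tau_c_idx` are assumptions; the question only states that A is the primary key and that B and C are indexed. The table that B refers to is not shown, so no foreign key constraint is declared here.

```sql
-- Sketch of the sample table; types and index names are assumed, not from the question.
CREATE TABLE tau (
    A integer PRIMARY KEY,
    B integer,   -- a foreign key in the real schema (referenced table not shown)
    C text
);

-- The nine sample rows listed above.
INSERT INTO tau (A, B, C) VALUES
    (2, 1, 'red'),  (3, 1, 'rod'),   (4, 1, 'rope'),
    (6, 5, 'red'),  (7, 5, 'rap'),   (8, 5, 'rod'),
    (9, 10, 'rod'), (11, 10, 'road'), (12, 13, 'rud');

-- B and C are described as indexed for search.
CREATE INDEX tau_b_idx ON tau (B);
CREATE INDEX tau_c_idx ON tau (C);
```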

I want exactly M distinct values from column B. Let's say M = 2.

Importantly, I am using column B in the expression (it's indexed too!) to determine which B's to return.

Normally, I would write:

select distinct B From tau Where C like 'r_d' AND B < 13 Order By B Desc Limit 2

and I get:

B
-----
10
5

This is the current state of affairs. But I want to upgrade to a new scenario:

When the expression on B and C (C like 'r_d' AND B < 13) is satisfied and a record is added to the result pile, I want to return column A as well as column B, while keeping the restriction of at most 2 distinct column B values.

Another important condition is that the solution must work for where-clause expressions that return true for possibly different values of C.

Behold, the results I want:

A | B
--+----
9 | 10
8 | 5
6 | 5

The trouble is that the actual number of records returned can be greater than M (2 in the example). I don't really care how many records come through, as long as there are only M distinct values of B. How would I write a query to accomplish this in PostgreSQL?

DENSE_RANK should do what you need. Here is an SQL Fiddle. Put your M in place of the 2 in WHERE R.rnk <= 2.

Rather than starting with M distinct values from B and trying to figure out how to add the missing values from column A to the final result, I'm thinking about this problem from the other side. We have some search conditions that limit the whole table to an intermediate result set:

SELECT *
FROM tau
WHERE C LIKE 'r_d' AND B < 13

This intermediate result set has all columns from the table; nothing is aggregated yet. We just need to filter this set further and keep only M distinct values of the B column. DENSE_RANK assigns a number (without gaps) to each group of equal B values, so we can use it in the final filter.

SELECT
    A, B
FROM
    (
        SELECT
            A
            ,B
            ,C
            ,DENSE_RANK() OVER(ORDER BY B DESC) AS rnk
        FROM tau
        WHERE C LIKE 'r_d' AND B < 13
    ) AS R
WHERE R.rnk <= 2
ORDER BY B DESC, A DESC;

Result set:

A    B
9    10
8    5
6    5
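
To see why the filter rnk <= 2 keeps exactly two distinct values of B, it may help to run the inner query on its own. The sketch below inlines the question's sample rows as a VALUES list (an assumption, made only so that it runs without the real table) and shows the rank given to every row that passes the search condition.

```sql
-- Inline copy of the sample rows; stands in for the real tau table.
WITH tau (A, B, C) AS (
    VALUES (2, 1, 'red'),  (3, 1, 'rod'),   (4, 1, 'rope'),
           (6, 5, 'red'),  (7, 5, 'rap'),   (8, 5, 'rod'),
           (9, 10, 'rod'), (11, 10, 'road'), (12, 13, 'rud')
)
SELECT
    A
    ,B
    ,DENSE_RANK() OVER (ORDER BY B DESC) AS rnk  -- equal B values share one rank
FROM tau
WHERE C LIKE 'r_d' AND B < 13
ORDER BY B DESC, A DESC;
```

A    B    rnk
9    10   1
8    5    2
6    5    2
3    1    3
2    1    3

Rows with the same B share a rank and the ranks have no gaps, so rnk <= M keeps every matching row for the M largest matching values of B.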

Second variant using LATERAL JOIN

There is an SQL Fiddle with both variants. In the second variant we first find M distinct values of B. Then, for each value found, we run the main query with an extra filter on that B value. If there are millions of different values of B, filtering to a specific value should be efficient, provided there is an index on B. If there are millions of rows but the total number of distinct values of B is small, it should be slow, presumably because each B value then matches a large share of the table.

WITH
CTE
AS
(
    select distinct B 
    From tau 
    Where C like 'r_d' AND B < 13 
    Order By B Desc 
    Limit 2
)
SELECT
  T.A, T.B, T.C
FROM
  CTE
  INNER JOIN LATERAL
  (
      SELECT tau.A, tau.B, tau.C
      FROM tau
      WHERE tau.B = CTE.B AND tau.C like 'r_d' AND tau.B < 13 
  ) AS T ON true
ORDER BY T.B DESC, T.A DESC;
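
Because the lateral subquery is joined ON true and only filters on CTE.B, the same rows should also be obtainable with an ordinary join. The following is a sketch of that alternative and not part of the original answer; the CTE name top_b is made up for illustration, and the sample rows are inlined as a VALUES list (an assumption) so that it runs on its own. Whether the planner treats it any differently from the LATERAL form is something to check with EXPLAIN on real data.

```sql
-- Inline copy of the sample rows; stands in for the real tau table.
WITH tau (A, B, C) AS (
    VALUES (2, 1, 'red'),  (3, 1, 'rod'),   (4, 1, 'rope'),
           (6, 5, 'red'),  (7, 5, 'rap'),   (8, 5, 'rod'),
           (9, 10, 'rod'), (11, 10, 'road'), (12, 13, 'rud')
),
-- The M (here 2) largest B values that satisfy the search condition.
top_b AS (
    SELECT DISTINCT B
    FROM tau
    WHERE C LIKE 'r_d' AND B < 13
    ORDER BY B DESC
    LIMIT 2
)
-- All matching rows that belong to one of those B values.
SELECT tau.A, tau.B, tau.C
FROM tau
INNER JOIN top_b ON top_b.B = tau.B
WHERE tau.C LIKE 'r_d' AND tau.B < 13
ORDER BY tau.B DESC, tau.A DESC;
```

On the sample data this returns the rows (9, 10, rod), (8, 5, rod) and (6, 5, red), the same A and B values as the first variant.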
