Using a single SQL correlated sub-query to get two columns

My problem is represented by the following query:

SELECT 
  b.row_id, b.x, b.y, b.something,
  (SELECT a.x FROM my_table a WHERE a.row_id = (b.row_id - 1), a.something != 42 ) AS source_x,
  (SELECT a.y FROM my_table a WHERE a.row_id = (b.row_id - 1), a.something != 42 ) AS source_y
FROM 
  my_table b

I'm using the same subquery twice, to get both source_x and source_y. That's why I'm wondering whether it's possible to do this with only one subquery.

When I run this query on my real data (millions of rows), it never seems to finish: it takes hours, if not days (my connection hangs up before the end).

I am using PostgreSQL 8.4.

I think you can use this approach:

SELECT b.row_id
     , b.x
     , b.y
     , b.something
     , a.x AS source_x
     , a.y AS source_y
  FROM my_table b
  LEFT JOIN my_table a ON a.row_id = (b.row_id - 1)
                      AND a.something != 42
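
To check that the join returns what the two sub-queries were meant to return, here is a minimal sketch on made-up data. The table name my_table_demo and its four rows are assumptions for illustration only, and the sub-queries use AND where the question has a comma.

```sql
-- Hypothetical demo table; the name and the rows are made up for illustration.
CREATE TEMP TABLE my_table_demo (
    row_id    integer PRIMARY KEY,
    x         integer,
    y         integer,
    something integer
);

INSERT INTO my_table_demo (row_id, x, y, something) VALUES
    (1, 10, 100,  7),
    (2, 20, 200, 42),
    (3, 30, 300,  7),
    (4, 40, 400,  7);

-- Variant 1: two correlated sub-queries (comma replaced by AND).
SELECT b.row_id, b.x, b.y, b.something,
       (SELECT a.x FROM my_table_demo a
         WHERE a.row_id = b.row_id - 1 AND a.something != 42) AS source_x,
       (SELECT a.y FROM my_table_demo a
         WHERE a.row_id = b.row_id - 1 AND a.something != 42) AS source_y
  FROM my_table_demo b
 ORDER BY b.row_id;

-- Variant 2: a single LEFT JOIN that fetches both columns at once.
SELECT b.row_id, b.x, b.y, b.something,
       a.x AS source_x,
       a.y AS source_y
  FROM my_table_demo b
  LEFT JOIN my_table_demo a ON a.row_id = b.row_id - 1
                           AND a.something != 42
 ORDER BY b.row_id;

-- Both variants return:
--  row_id | x  | y   | something | source_x | source_y
--       1 | 10 | 100 |         7 | NULL     | NULL      (no row 0)
--       2 | 20 | 200 |        42 |       10 |      100
--       3 | 30 | 300 |         7 | NULL     | NULL      (row 2 has something = 42)
--       4 | 40 | 400 |         7 |       30 |      300
```

The two variants only agree when row_id is unique: with duplicate row_id values the scalar sub-queries raise an error, while the join returns one output row per match.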

@DavidEG posted the best syntax for the query.

However, your problem is definitely not just with the query technique. A JOIN instead of two subqueries can speed things up by a factor of two at best, and most likely less. That doesn't explain "hours". Even with millions of rows, a decently set up Postgres should finish this simple query in seconds, not hours.

  • The first thing that stands out is the syntax error in your query:

     ... WHERE a.row_id = (b.row_id - 1), a.something != 42 

    AND or OR is needed here, not a comma.

  • The next thing to check is indexes. If row_id is not the primary key, you may not have an index on it. For optimum performance of this particular query, create a multi-column index on (row_id, something) like this:

     CREATE INDEX my_table_row_id_something_idx ON my_table (row_id, something)

  • If the filter always excludes the same value, as in something != 42, you can use a partial index instead for an additional speed-up:

     CREATE INDEX my_table_row_id_something_idx ON my_table (row_id) WHERE something != 42 

    This will only make a substantial difference if 42 is a common value or if something is a wider column than just an integer. (An index with two integer columns normally occupies the same size on disk as an index with just one, due to data alignment.)

  • When performance is an issue, it is always a good idea to check your settings. The default settings in Postgres use minimal resources in many distributions and are not up to handling "millions of rows".

  • Depending on your actual version of Postgres, an upgrade to a current version (9.1 at the time of writing) may help a lot.

  • Ultimately, hardware is always a factor, too. Tuning and optimizing can only get you so far.
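
The index and settings checks above can be tried out roughly as follows. This is only a sketch under stated assumptions: perf_demo, its generated rows, and the index name are made up, and the three settings shown are just commonly inspected memory parameters, not a list taken from the answer.

```sql
-- Hypothetical demo table filled with generated rows; substitute your real table.
CREATE TEMP TABLE perf_demo (
    row_id    integer PRIMARY KEY,
    x         integer,
    y         integer,
    something integer
);

INSERT INTO perf_demo (row_id, x, y, something)
SELECT s.i, s.i, s.i, s.i % 50
  FROM generate_series(1, 100000) AS s(i);

-- Partial index on the join column, restricted to the rows the filter keeps.
-- (The index name is made up for this sketch.)
CREATE INDEX perf_demo_row_id_not_42_idx
    ON perf_demo (row_id)
 WHERE something != 42;

-- Refresh planner statistics so the plan reflects the loaded data.
ANALYZE perf_demo;

-- Run the query and print the chosen plan with actual row counts and timings.
EXPLAIN ANALYZE
SELECT b.row_id, b.x, b.y, b.something,
       a.x AS source_x,
       a.y AS source_y
  FROM perf_demo b
  LEFT JOIN perf_demo a ON a.row_id = b.row_id - 1
                       AND a.something != 42;

-- Inspect a few memory-related settings and the server version.
SHOW shared_buffers;
SHOW work_mem;
SHOW effective_cache_size;
SELECT version();
```

The plan and timings depend on the data, the settings, and the Postgres version, so no expected output is given. Comparing the EXPLAIN ANALYZE output before and after a change is one way to tell whether an index or a setting actually helped.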

Old-fashioned syntax:

SELECT 
  b.row_id, b.x, b.y, b.something
  , a.x AS source_x
  , a.y AS source_y
FROM my_table b
    ,my_table a 
WHERE a.row_id = b.row_id - 1
  AND a.something != 42
  ;

Join syntax:

SELECT 
  b.row_id, b.x, b.y, b.something
  , a.x AS source_x
  , a.y AS source_y
FROM my_table b
JOIN my_table a 
  ON (a.row_id = b.row_id - 1)
WHERE a.something != 42
  ;

SELECT b.row_id, b.x, b.y, b.something, a.x AS source_x, a.y AS source_y
  FROM my_table b
  LEFT JOIN (
    SELECT row_id + 1 AS row_id, x, y
      FROM my_table
      WHERE something != 42
  ) AS a ON a.row_id = b.row_id;
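
One difference between these variants is worth checking on sample data: an inner join (comma syntax or JOIN) returns only rows that have a qualifying predecessor, while a LEFT JOIN keeps every row, as the original sub-queries do. A minimal sketch, where the table join_demo and its rows are made up for illustration:

```sql
-- Hypothetical demo table; the name and the rows are made up for illustration.
CREATE TEMP TABLE join_demo (
    row_id    integer PRIMARY KEY,
    x         integer,
    y         integer,
    something integer
);

INSERT INTO join_demo (row_id, x, y, something) VALUES
    (1, 10, 100,  7),
    (2, 20, 200, 42),
    (3, 30, 300,  7);

-- Inner join: rows without a qualifying predecessor disappear.
SELECT b.row_id, a.x AS source_x, a.y AS source_y
  FROM join_demo b
  JOIN join_demo a ON a.row_id = b.row_id - 1
 WHERE a.something != 42
 ORDER BY b.row_id;
-- Returns one row:
--  row_id | source_x | source_y
--       2 |       10 |      100

-- LEFT JOIN against a derived table: every row of b is kept.
SELECT b.row_id, a.x AS source_x, a.y AS source_y
  FROM join_demo b
  LEFT JOIN (
    SELECT row_id + 1 AS row_id, x, y
      FROM join_demo
     WHERE something != 42
  ) AS a ON a.row_id = b.row_id
 ORDER BY b.row_id;
-- Returns three rows:
--  row_id | source_x | source_y
--       1 | NULL     | NULL
--       2 |       10 |      100
--       3 | NULL     | NULL
```

Which behaviour is wanted depends on whether rows without a source row should appear in the result. The query in the question keeps them.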
