
Finding count in two unrelated tables in a single query

The following query is used to count the number of rows in two unrelated tables in a single query.

  With t1 as (Select 1 
              Union Select  2 
              Union Select 3),
  t2 as (Select 'A' 
         Union Select 'B')

  Select (Select count(*) from t1), (Select count(*) from t2)

Is there a better way that avoids the two select statements in the select list?

The output should be

3 2

Any construct that is specific to Postgres will also do.

Simple and correct

First of all, you can simplify your test case with a VALUES expression instead of the more verbose UNION ALL SELECT.
You'd need explicit type casts in the first row if the data types are not the defaults, integer and text.
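To illustrate the point about casts, here is a minimal sketch with non-default types. The column names and the choice of bigint and date are made up for the example; only the first row carries casts, and the remaining rows are resolved to the same types.

```sql
-- Hypothetical columns: col1 should be bigint, col2 should be date.
-- Casting the first row is enough; later rows are coerced to match.
WITH t(col1, col2) AS (
   VALUES (1::bigint, '2024-01-01'::date)
        , (2        , '2024-02-01')
   )
SELECT col1, col2, pg_typeof(col1) AS col1_type, pg_typeof(col2) AS col2_type
FROM   t;
```

pg_typeof() is only there to check which types the columns ended up with.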

Second, the FULL OUTER JOIN suggested in another answer is utterly pointless. All it does is make your query slower. And if any row has more than one match in the other table, it gets multiplied in the count.

WITH t1(col1, col2) AS (VALUES (1, 1),   (2, 2),   (3, 3))
    ,t2(col1, col2) AS (VALUES (1, 'A'), (2, 'B'), (2, 'C'))  -- 2nd row for "2"
SELECT count(t1.*), count(t2.*)
FROM t1
FULL OUTER JOIN t2 USING (col1);

Yields:

4   3

which is wrong: t1 has only 3 rows.

WITH t1(col2) AS (VALUES (1),   (2),   (3))
    ,t2(col2) AS (VALUES ('A'), ('B'), ('C'))
SELECT (SELECT count(*) FROM t1) AS t1_ct
      ,(SELECT count(*) FROM t2) AS t2_ct;

Yields:

3   3

which is correct, besides being simpler and faster.
Admittedly, with row_number() freshly applied, there can be no dupes. But it's just a waste of time.

Performance

Counting is relatively slow for big tables. If you don't need an exact count but can live with an estimate, you can get this extremely fast:

SELECT reltuples::bigint AS estimate
FROM   pg_class
WHERE  oid = 'myschema.mytable'::regclass;

I quote the manual here:

It is updated by VACUUM, ANALYZE, and a few DDL commands such as CREATE INDEX.

More details in this related answer.
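Applied to the original question, the estimate for both tables can be fetched in one statement. This is only a sketch: myschema.t1 and myschema.t2 are assumed names, and the numbers are only as fresh as the last VACUUM or ANALYZE.

```sql
-- Assumed table names; replace with your own.
-- Refresh the statistics first if they may be stale.
ANALYZE myschema.t1;
ANALYZE myschema.t2;

SELECT (SELECT reltuples::bigint FROM pg_class
        WHERE  oid = 'myschema.t1'::regclass) AS t1_estimate
      ,(SELECT reltuples::bigint FROM pg_class
        WHERE  oid = 'myschema.t2'::regclass) AS t2_estimate;
```

Note that in PostgreSQL 14 and later, reltuples is -1 for a table that has never been vacuumed or analyzed.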

Counting is an expensive operation on large tables, since every row has to be visited. Try to avoid it whenever possible. If you need the total number of rows of a table without any condition, some RDBMS offer a workaround. For example, in MS SQL Server it looks like this:

select SUM(row_count) as Total_Rows
 from sys.dm_db_partition_stats
where object_name(object_id) = 'YourTableName' 
  and index_id < 2

An alternative is to maintain your count in a separate table, e.g. if you need the total number grouped by a certain value. You would increase and decrease the counts using a trigger. This is recommended if, for example, you have to show a count on the main form all the time (active users, active posts per area, etc.).
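A rough sketch of such a trigger-maintained counter in PostgreSQL (11 or later) could look like the following. All names (posts, post_counts, trg_post_counts, post_counts_sync) are hypothetical, and the sketch only handles INSERT and DELETE; an UPDATE that changes area would need its own branch.

```sql
-- Hypothetical base table: one row per post, grouped by area.
CREATE TABLE posts (
   post_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY
 , area    text NOT NULL
);

-- Hypothetical counter table: one row per area.
CREATE TABLE post_counts (
   area text PRIMARY KEY
 , ct   bigint NOT NULL DEFAULT 0
);

CREATE FUNCTION trg_post_counts() RETURNS trigger
   LANGUAGE plpgsql AS
$$
BEGIN
   IF TG_OP = 'INSERT' THEN
      -- Create the counter row on first use, otherwise increment it.
      INSERT INTO post_counts AS pc (area, ct)
      VALUES (NEW.area, 1)
      ON CONFLICT (area) DO UPDATE SET ct = pc.ct + 1;
   ELSIF TG_OP = 'DELETE' THEN
      UPDATE post_counts SET ct = ct - 1 WHERE area = OLD.area;
   END IF;
   RETURN NULL;  -- the return value is ignored for AFTER row triggers
END
$$;

CREATE TRIGGER post_counts_sync
AFTER INSERT OR DELETE ON posts
FOR EACH ROW EXECUTE FUNCTION trg_post_counts();
```

Reading a count is then a primary-key lookup, e.g. SELECT ct FROM post_counts WHERE area = 'news'; The trade-off is extra work on every write, and the counter row can become a point of contention under heavy concurrent inserts.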

Introduce a relationship between the two tables by adding a row_number column, then do a full outer join.

  With t1 as (Select 1 as Col1, 1 
              Union Select  2, 2 
              Union Select 3, 3),
  t2 as (Select -1 as Col1, 'A' 
         Union Select -2, 'B')

  Select count(t1.*), count(t2.*) from t1 full outer join t2 on t1.Col1 = t2.Col1
