
Better SQL to sum same columns across multiple tables?

I am trying to sum some columns across multiple MySQL tables using Python/SQLAlchemy. The number of tables is dynamic, and each table has the same schema.

Table_1
| col1 | col2| ... |

Table_2
| col1 | col2| ... |

Table_...
| col1 | col2| ... |

I studied SQLAlchemy and realised that the better idea might be to generate the SQL text and execute it. Creating models might not be a good solution, since I feel that may add a performance cost. I would prefer a single SQL statement.

select (t1.col1 + t2.col1 + t3.col1 + t?.col1 ...) as col1, (t1.col2 + t2.col2 + ...) as col2,
 ... from
(select sum(col1), sum(col2), sum(col3) ... from Table_1 as t1,
 select sum(col1), sum(col2), sum(col3) ... from Table_2 as t2,
 ...
)

The above is the SQL I intend to generate using Python. I am not a SQL professional, so I am not sure whether it is a good statement. Is there a better solution, simpler and more efficient, than this?

Your general approach looks reasonable. Getting the SUMs from the individual tables as a single row each, and then combining those, is the most efficient approach. There are just a couple of minor fixes.

It looks like you will need to provide an alias for each of the SUM() expressions returned.

And you're going to need to wrap the SELECT from each table in a set of parens, and give each of those inline views an alias.

Also, there's a potential for one of the inner SUM() expressions to return a NULL (for example, when a table is empty), so the addition performed in the outer query could return a NULL. One fix for that would be to wrap the inner SUM expressions in an IFNULL or COALESCE, to replace a NULL with a zero, but that could introduce a zero where the outer SUM would really be a NULL.
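To make the NULL issue concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module purely as a stand-in for MySQL (an assumption for the sake of a self-contained demo; SUM() over zero rows returns NULL in both), and the tables and data are made up for illustration.

```python
import sqlite3

# sqlite3 is only a stand-in for MySQL here; table names and data are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (col1 INTEGER)")
conn.execute("CREATE TABLE table2 (col1 INTEGER)")  # left empty on purpose
conn.executemany("INSERT INTO table1 VALUES (?)", [(1,), (2,)])

# Plain addition: the NULL coming from the empty table makes the total NULL.
plain_total = conn.execute(
    "SELECT t1.s + t2.s "
    "FROM (SELECT SUM(col1) AS s FROM table1) t1 "
    "CROSS JOIN (SELECT SUM(col1) AS s FROM table2) t2"
).fetchone()[0]

# IFNULL around each inner SUM: the empty table contributes 0 instead.
guarded_total = conn.execute(
    "SELECT t1.s + t2.s "
    "FROM (SELECT IFNULL(SUM(col1), 0) AS s FROM table1) t1 "
    "CROSS JOIN (SELECT IFNULL(SUM(col1), 0) AS s FROM table2) t2"
).fetchone()[0]

print(plain_total, guarded_total)  # None 3
```

The first total comes back as None (NULL) even though table1 has data, which is the anomaly described above.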

Personally, I'd avoid using the comma notation for the JOIN operation. The comma is valid, but I'd write it out using the CROSS JOIN keywords, to make it a little more readable.

But my preference would be to avoid the JOIN and the addition operations in the outer query. I'd use a SUM aggregate in the outer query, something like this:

SELECT SUM(t.col1_tot) AS col1_tot
     , SUM(t.col2_tot) AS col2_tot
     , SUM(t.col3_tot) AS col3_tot
  FROM ( SELECT SUM(col1) AS col1_tot
              , SUM(col2) AS col2_tot 
              , SUM(col3) AS col3_tot 
           FROM table1
          UNION ALL
         SELECT SUM(col1) AS col1_tot
              , SUM(col2) AS col2_tot 
              , SUM(col3) AS col3_tot 
           FROM table2
          UNION ALL
         SELECT SUM(col1) AS col1_tot
              , SUM(col2) AS col2_tot
              , SUM(col3) AS col3_tot 
           FROM table3
       ) t

That avoids anomalies with NULL values, and makes it return the same values that would be returned if the individual tables were all concatenated together. But this isn't any more efficient than what you have.
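Since the number of tables is dynamic, the statement above could be generated from a list of table names. The following is only a sketch: build_sum_query is a hypothetical helper name, the identifier check is one possible safeguard (identifiers cannot be passed as bind parameters), and sqlite3 is used here only so the example runs on its own.

```python
import re
import sqlite3

def build_sum_query(tables, columns):
    """Build one SELECT that totals `columns` across all `tables` (hypothetical helper)."""
    if not tables or not columns:
        raise ValueError("need at least one table and one column")
    # Table and column names are interpolated into the SQL, so accept plain names only.
    for name in list(tables) + list(columns):
        if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", name):
            raise ValueError(f"unsafe identifier: {name!r}")
    inner_cols = ", ".join(f"SUM({c}) AS {c}_tot" for c in columns)
    outer_cols = ", ".join(f"SUM(t.{c}_tot) AS {c}_tot" for c in columns)
    inner = " UNION ALL ".join(f"SELECT {inner_cols} FROM {t}" for t in tables)
    return f"SELECT {outer_cols} FROM ({inner}) t"

# Demo data (made up): table3 is empty, table2 has a NULL in col2.
conn = sqlite3.connect(":memory:")
demo = {"table1": [(1, 10), (2, 20)], "table2": [(3, None)], "table3": []}
for name, rows in demo.items():
    conn.execute(f"CREATE TABLE {name} (col1 INTEGER, col2 INTEGER)")
    conn.executemany(f"INSERT INTO {name} VALUES (?, ?)", rows)

sql = build_sum_query(list(demo), ["col1", "col2"])
totals = conn.execute(sql).fetchone()
print(totals)  # (6, 30)
```

With SQLAlchemy, the generated string could presumably be wrapped in sqlalchemy.text() and passed to a connection's execute() instead of the sqlite3 call used here.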


To use the JOIN method, as in your query (if you don't mind returning a zero where the query above would have returned a NULL), something like this would make that approach work:

SELECT t1.col1_tot + t2.col1_tot + t3.col1_tot  AS col1_tot
     , t1.col2_tot + t2.col2_tot + t3.col2_tot  AS col2_tot
     , t1.col3_tot + t2.col3_tot + t3.col3_tot  AS col3_tot
  FROM ( SELECT IFNULL(SUM(col1),0) AS col1_tot
              , IFNULL(SUM(col2),0) AS col2_tot 
              , IFNULL(SUM(col3),0) AS col3_tot 
           FROM table1
       ) t1
 CROSS  
  JOIN ( SELECT IFNULL(SUM(col1),0) AS col1_tot
              , IFNULL(SUM(col2),0) AS col2_tot 
              , IFNULL(SUM(col3),0) AS col3_tot 
           FROM table2
       ) t2
 CROSS  
  JOIN ( SELECT IFNULL(SUM(col1),0) AS col1_tot
              , IFNULL(SUM(col2),0) AS col2_tot 
              , IFNULL(SUM(col3),0) AS col3_tot
           FROM table3
       ) t3

But, again, my personal preference would be to avoid doing those addition operations in the outer query. I'd use the SUM aggregate, and UNION the results from the individual tables, rather than doing a join.

Unless you have some WHERE clauses to join those tables together, you're going to end up with a cartesian join, where every record from each table in the query is joined against all other combinations of records from the other tables. So if each of those tables has (say) 1000 records, and you've got 5 tables in the query, you're going to end up with 1000^5 = 1,000,000,000,000,000 records in the result set.
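The row-count arithmetic can be checked on a small scale. This sketch again assumes sqlite3 as a stand-in for MySQL, with three made-up tables of 10 rows each. It also shows that the blow-up applies to joining the raw tables: when each joined item is an aggregate subquery that returns a single row, the cross join itself yields a single row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
for name in ("table1", "table2", "table3"):  # made-up tables, 10 rows each
    conn.execute(f"CREATE TABLE {name} (col1 INTEGER)")
    conn.executemany(f"INSERT INTO {name} VALUES (?)", [(i,) for i in range(10)])

# Cross join of the raw tables: row counts multiply (10 * 10 * 10).
raw_rows = conn.execute(
    "SELECT COUNT(*) FROM table1 CROSS JOIN table2 CROSS JOIN table3"
).fetchone()[0]

# Cross join of one-row aggregate subqueries: 1 * 1 * 1 rows.
agg_rows = conn.execute(
    "SELECT COUNT(*) "
    "FROM (SELECT SUM(col1) AS s FROM table1) t1 "
    "CROSS JOIN (SELECT SUM(col1) AS s FROM table2) t2 "
    "CROSS JOIN (SELECT SUM(col1) AS s FROM table3) t3"
).fetchone()[0]

print(raw_rows, agg_rows)  # 1000 1
```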

What you want is probably something more like this:

SELECT sum(col1) AS sum1, sum(col2) AS sum2, ....
FROM (
   SELECT col1, col2, col3, ... FROM table1
   UNION ALL
   SELECT col1, col2, col3, ... FROM table2
   UNION ALL
   ...
) a

The inner UNION ALL takes the selected columns from each of those tables and turns them into a single contiguous result set. The outer query then sums the values in each of those columns.
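As a quick sanity check, concatenating the raw rows and then aggregating once should give the same totals as aggregating per table and summing the per-table results. The sketch below assumes sqlite3 as a stand-in for MySQL and uses made-up data, including a NULL value and an empty table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
data = {"table1": [(1, 10), (2, None)], "table2": [(3, 30)], "table3": []}
for name, rows in data.items():
    conn.execute(f"CREATE TABLE {name} (col1 INTEGER, col2 INTEGER)")
    conn.executemany(f"INSERT INTO {name} VALUES (?, ?)", rows)
tables = list(data)

# Variant A: UNION ALL the raw rows, then aggregate once in the outer query.
rows_union = " UNION ALL ".join(f"SELECT col1, col2 FROM {t}" for t in tables)
by_rows = conn.execute(
    f"SELECT SUM(col1), SUM(col2) FROM ({rows_union}) a"
).fetchone()

# Variant B: aggregate each table first, then sum the per-table totals.
sums_union = " UNION ALL ".join(
    f"SELECT SUM(col1) AS s1, SUM(col2) AS s2 FROM {t}" for t in tables
)
by_sums = conn.execute(
    f"SELECT SUM(s1), SUM(s2) FROM ({sums_union}) a"
).fetchone()

print(by_rows, by_sums)  # (6, 40) (6, 40)
```

Variant B hands only one row per table to the outer query, which may matter when the tables are large.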

This may help you:

select SUM(col1),SUM(col2) from
(
select col1,col2 from Table1 
union all
select col1,col2 from Table2
union all
select col1,col2 from Table3 
)t 
