SQL Optimization: multiplication of two calculated fields generated by window functions

Given two time-series tables, tbl1(time, b_value) and tbl2(time, u_value).

https://www.db-fiddle.com/f/4qkFJZLkZ3BK2tgN4ycCsj/1

Suppose we want to find the last value of u_value on each day, the cumulative sum of b_value up to and including that day, and their product, i.e. daily_u_value * b_value_cum_sum.

The following query calculates the desired output:

WITH cte AS (
SELECT
  t1.time,
  t1.b_value,
  t2.u_value * t1.b_value AS bu_value,
  last_value(t2.u_value) 
  OVER 
  (PARTITION BY DATE_TRUNC('DAY', t1.time) ORDER BY DATE_TRUNC('DAY', t2.time) ) AS daily_u_value
FROM stackoverflow.tbl1 t1
LEFT JOIN stackoverflow.tbl2 t2
ON 
    t1.time = t2.time
)
SELECT
    DATE_TRUNC('DAY', c.time) AS time,
    AVG(c.daily_u_value) AS daily_u_value, 
    SUM( SUM(c.b_value)) OVER (ORDER BY DATE_TRUNC('DAY', c.time) ) as b_value_cum_sum,
    AVG(c.daily_u_value) * SUM( SUM(c.b_value) ) OVER (ORDER BY DATE_TRUNC('DAY', c.time) ) as daily_u_value_mul_b_value
FROM cte c
GROUP BY 1
ORDER BY 1 DESC

What can I do to optimize this query? Is there an alternative solution that generates the same result?

db-fiddle demo
Your query: Execution Time: 250.666 ms. My query: Execution Time: 205.103 ms.
So there is some improvement. It comes mainly from reducing the time spent on casts, since your query casts from timestamptz to timestamp many times. Why not store the date in a separate column instead?


I ran my query first and then yours, so the comparison is fair to your query, since a second execution is generally faster than the first.
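Execution times like the ones quoted here are what PostgreSQL prints at the end of EXPLAIN ANALYZE output. As a minimal sketch of how such a measurement might be taken (the simplified per-day aggregate below is only an illustration, not one of the two queries being compared), prefix the statement with EXPLAIN:

```sql
-- EXPLAIN ANALYZE actually runs the statement and reports the plan,
-- "Planning Time" and "Execution Time". BUFFERS adds cache hit/read counts,
-- which helps tell a warm-cache run from a cold one.
-- The SELECT is an illustrative stand-in; substitute the query under test.
EXPLAIN (ANALYZE, BUFFERS)
SELECT
    DATE_TRUNC('DAY', t1.time) AS time,
    SUM(t1.b_value)            AS b_value_sum
FROM stackoverflow.tbl1 t1
GROUP BY 1;
```

Running each candidate several times and comparing the later runs tends to reduce the effect of caching on the numbers.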


alter table tbl1 add column t1_date date;
alter table tbl2 add column t2_date date;
update tbl1 set t1_date = time::date;
update tbl2 set t2_date = time::date;
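A one-off UPDATE leaves the date column empty for rows inserted later. One possible alternative, sketched under assumptions, is a stored generated column (PostgreSQL 12 or later). The sketch assumes that "time" is timestamptz and that a day means a UTC calendar day. The explicit time zone is needed because a plain time::date cast on timestamptz depends on the session time zone and is therefore not accepted in a generation expression.

```sql
-- Alternative to the ADD COLUMN + UPDATE pair above (use one or the other).
-- Assumption: "time" is timestamptz and days are UTC calendar days.
-- This can differ from DATE_TRUNC('DAY', time), which uses the session time zone.
ALTER TABLE tbl1
  ADD COLUMN t1_date date
  GENERATED ALWAYS AS (("time" AT TIME ZONE 'UTC')::date) STORED;

ALTER TABLE tbl2
  ADD COLUMN t2_date date
  GENERATED ALWAYS AS (("time" AT TIME ZONE 'UTC')::date) STORED;
```

The column is then maintained automatically on INSERT and UPDATE, at the cost of extra storage per row.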

WITH cte AS (
SELECT
  t1.t1_date,
  t1.b_value,
  t2.u_value * t1.b_value AS bu_value,
  last_value(t2.u_value)
  OVER
  (PARTITION BY t1_date ORDER BY t2_date ) AS daily_u_value
FROM stackoverflow.tbl1 t1
LEFT JOIN stackoverflow.tbl2 t2
ON
    t1.time = t2.time
)
SELECT
    t1_date,
    AVG(c.daily_u_value) AS daily_u_value,
    SUM( SUM(c.b_value)) OVER (ORDER BY t1_date ) as b_value_cum_sum,
    AVG(c.daily_u_value) * SUM( SUM(c.b_value) ) OVER
        (ORDER BY t1_date ) as daily_u_value_mul_b_value
FROM cte c
GROUP BY 1
ORDER BY 1 DESC
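One caveat applies to both versions of the query. The window is ordered by a day-level value, so all matched rows within a day are peers and last_value is not guaranteed to return the chronologically latest u_value. The following is a hedged sketch, not a drop-in replacement: it assumes the intent is the latest u_value by t1.time and that the t1_date column added above exists.

```sql
-- Sketch only: orders each day's rows by the full timestamp and widens the
-- frame to the whole partition, so last_value sees the final row of the day.
-- Caveat: if the last tbl1 row of a day has no match in tbl2, the result is
-- NULL for that day (PostgreSQL has no IGNORE NULLS for last_value).
WITH cte AS (
  SELECT
    t1.t1_date,
    t1.b_value,
    last_value(t2.u_value) OVER (
      PARTITION BY t1.t1_date
      ORDER BY t1.time
      ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    ) AS daily_u_value
  FROM stackoverflow.tbl1 t1
  LEFT JOIN stackoverflow.tbl2 t2
    ON t1.time = t2.time
)
SELECT
  t1_date,
  AVG(daily_u_value) AS daily_u_value,
  SUM(SUM(b_value)) OVER (ORDER BY t1_date) AS b_value_cum_sum,
  AVG(daily_u_value) * SUM(SUM(b_value)) OVER (ORDER BY t1_date)
    AS daily_u_value_mul_b_value
FROM cte
GROUP BY 1
ORDER BY 1 DESC;
```

Because daily_u_value is constant within a day here, AVG only collapses it to one value per group. The results can differ from the original query wherever the original picked a different peer row.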
