
How can I improve the performance of my postgresql query?

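I have a query that returns the sums of buys, sells and transfers per account, grouped by time interval. The problem is that it is very slow, and I am only covering transactions from the last 24 hours; I want to be able to run this over all transactions (800,000 over 2 years). How can I optimize it?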

select
    i.interval, ca.contract_address,
    coalesce(SUM(t.amount) FILTER (WHERE t.action = 0), 0) as amount_ampl_bought,
    coalesce(SUM(t.amount) FILTER (WHERE t.action = 1), 0) as amount_ampl_sold,
    coalesce(SUM(t.amount) FILTER (WHERE t.action = 2), 0) as amount_ampl_transferred,
    coalesce(SUM(t.supply_percentage) FILTER (WHERE t.action = 0), 0) as percent_ampl_bought,
    coalesce(SUM(t.supply_percentage) FILTER (WHERE t.action = 1), 0) as percent_ampl_sold,
    coalesce(SUM(t.supply_percentage) FILTER (WHERE t.action = 2), 0) as percent_ampl_transferred
from
    (
        select contract_address
        from addresses a
        where not exists (select 1 from address_tags at where at.address = a.contract_address and at.tag_id = 3)
    ) ca
cross join
    (
        SELECT date_trunc('hour', dd) as interval
        FROM generate_series
        (
            (now() at time zone 'utc') - interval '1 day',
            (now() at time zone 'utc'),
            '1 hour'::interval
        ) dd
    ) i
left join transfers t on (t.from = ca.contract_address or t.to = ca.contract_address) and date_trunc('hour', t.timestamp at time zone 'utc') = i.interval
group by i.interval, ca.contract_address;

Sample output:

      interval       |              contract_address              | amount_ampl_bought | amount_ampl_sold | amount_ampl_transferred |     percent_ampl_bought     |     percent_ampl_sold      |  percent_ampl_transferred  
---------------------+--------------------------------------------+--------------------+------------------+-------------------------+-----------------------------+----------------------------+----------------------------
 2021-05-08 11:00:00 | 0x0000000000000000000000000000000000000000 |                  0 |                0 |                       0 |                           0 |                          0 |                          0
 2021-05-08 11:00:00 | 0x000000000000000000000000000000000000dead |                  0 |                0 |                       0 |                           0 |                          0 |                          0
 2021-05-08 11:00:00 | 0x000000000000006f6502b7f2bbac8c30a3f67e9a |                  0 |                0 |                       0 |                           0 |                          0 |                          0
 2021-05-08 11:00:00 | 0x000000000000084e91743124a982076c59f10084 |                  0 |                0 |                       0 |                           0 |                          0 |                          0
 2021-05-08 11:00:00 | 0x0000000000000eb4ec62758aae93400b3e5f7f18 |                  0 |                0 |                       0 |                           0 |                          0 |                          0
 2021-05-08 11:00:00 | 0x00000000000017c75025d397b91d284bbe8fc7f2 |                  0 |                0 |                       0 |                           0 |                          0 |                          0
 2021-05-08 11:00:00 | 0x0000000000005117dd3a72e64a705198753fdd54 |                  0 |                0 |                       0 |                           0 |                          0 |                          0
 2021-05-08 11:00:00 | 0x000000000000740a22fa209cf6806d38f7605385 |                  0 |                0 |                       0 |                           0 |                          0 |                          0

Link to the visualized query plan:

https://explain.depesz.com/s/SrLf

Indexes I created on transfers:

 CREATE INDEX transfers_from_to_index ON public.transfers USING btree ("from", "to")
 CREATE INDEX transfers_timestamp_index ON public.transfers USING btree ("timestamp")
 CREATE INDEX transfers_action_index ON public.transfers USING btree (action)
 CREATE UNIQUE INDEX transfers_pkey ON public.transfers USING btree (transaction_hash, log_index)
 CREATE INDEX transfers_supply_percentage_index ON public.transfers USING btree (supply_percentage)
 CREATE INDEX transfers_amount_index ON public.transfers USING btree (amount)
 CREATE INDEX transfers_supply_percentage_timestamp_log_index_index ON public.transfers USING btree (supply_percentage, "timestamp", log_index)
 CREATE INDEX transfers_date_trunc_idx ON public.transfers USING btree (date_trunc('hour'::text, timezone('utc'::text, "timestamp")))
 CREATE INDEX transfers_to_index ON public.transfers USING btree ("to")

Indexes I created on addresses:

 CREATE UNIQUE INDEX addresses_pkey ON public.addresses USING btree (contract_address)
 CREATE INDEX addresses_supply_percentage_index ON public.addresses USING btree (supply_percentage)

Thank you very much for any help with optimizing this!

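I'm pretty sure the problem is the or in the join condition on transfers. Under reasonable assumptions, you should be able to split it into two separate left joins: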

-- Assumes an address is not both sender and receiver within the same hour;
-- otherwise the two left joins multiply each other's rows.
select i.interval, a.contract_address,
       -- SUM takes a single argument, so pick whichever joined row is present
       coalesce(SUM(COALESCE(tt.amount, tf.amount)) FILTER (WHERE COALESCE(tt.action, tf.action) = 0), 0) as amount_ampl_bought,
       coalesce(SUM(COALESCE(tt.amount, tf.amount)) FILTER (WHERE COALESCE(tt.action, tf.action) = 1), 0) as amount_ampl_sold,
       coalesce(SUM(COALESCE(tt.amount, tf.amount)) FILTER (WHERE COALESCE(tt.action, tf.action) = 2), 0) as amount_ampl_transferred,
       coalesce(SUM(COALESCE(tt.supply_percentage, tf.supply_percentage)) FILTER (WHERE COALESCE(tt.action, tf.action) = 0), 0) as percent_ampl_bought,
       coalesce(SUM(COALESCE(tt.supply_percentage, tf.supply_percentage)) FILTER (WHERE COALESCE(tt.action, tf.action) = 1), 0) as percent_ampl_sold,
       coalesce(SUM(COALESCE(tt.supply_percentage, tf.supply_percentage)) FILTER (WHERE COALESCE(tt.action, tf.action) = 2), 0) as percent_ampl_transferred
from addresses a cross join
     -- 24-hour window, as in the question
     generate_series(date_trunc('hour', (now() at time zone 'utc') - interval '1 day'),
                     date_trunc('hour', now() at time zone 'utc'),
                     '1 hour'::interval
                    ) i("interval") left join
      transfers tf
      on tf.from = a.contract_address and
         date_trunc('hour', tf.timestamp at time zone 'utc') = i.interval left join
      transfers tt
      on tt.to = a.contract_address and
         date_trunc('hour', tt.timestamp at time zone 'utc') = i.interval
where not exists (select 1
                  from address_tags at
                  where at.address = a.contract_address and at.tag_id = 3
                 )
group by i.interval, a.contract_address;

Then, for this query, you want the following indexes:

  • address_tags(address, tag_id)
  • transfers(to, timestamp)
  • transfers(from, timestamp)

(Note that to and from are really bad names for columns, because they are SQL keywords.)
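The three indexes listed above could be created roughly as follows. This is a sketch, not a tested migration: the index names are made up for illustration, and the double quotes are needed because to and from are reserved words when used as bare column names.

```sql
-- Hypothetical index names; the column lists follow the bullets above.
CREATE INDEX IF NOT EXISTS address_tags_address_tag_id_idx
    ON public.address_tags (address, tag_id);

-- "to" and "from" must be double-quoted when they appear as bare column names.
CREATE INDEX IF NOT EXISTS transfers_to_timestamp_idx
    ON public.transfers ("to", "timestamp");

CREATE INDEX IF NOT EXISTS transfers_from_timestamp_idx
    ON public.transfers ("from", "timestamp");
```

Since the join compares date_trunc('hour', ...) of the timestamp rather than the raw column, the planner may use only the leading address column of the two transfers indexes; checking the plan afterwards will show whether they help.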

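The conversion of timestamp to UTC might also be a problem. I would suggest that you fix your data so the timestamps are all in a common time zone; I recommend UTC for this (to avoid daylight saving time issues).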

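It looks like it is already doing most of the work for all time periods, and only filters out the ones you didn't ask for after most of the work is done. So if you want a different time period, just do it. If that is still too slow, then post the plan. Then at least we will be optimizing the right query.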
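One way to follow this advice is to run the question's query with the wider window under EXPLAIN (ANALYZE, BUFFERS) and post that output. The sketch below only changes the lower bound of the series; the '2 years' value is an assumption taken from the figure mentioned in the question.

```sql
-- Caution: EXPLAIN ANALYZE really executes the statement, so this can run for a long time.
EXPLAIN (ANALYZE, BUFFERS)
select
    i.interval, ca.contract_address,
    coalesce(SUM(t.amount) FILTER (WHERE t.action = 0), 0) as amount_ampl_bought,
    coalesce(SUM(t.amount) FILTER (WHERE t.action = 1), 0) as amount_ampl_sold,
    coalesce(SUM(t.amount) FILTER (WHERE t.action = 2), 0) as amount_ampl_transferred,
    coalesce(SUM(t.supply_percentage) FILTER (WHERE t.action = 0), 0) as percent_ampl_bought,
    coalesce(SUM(t.supply_percentage) FILTER (WHERE t.action = 1), 0) as percent_ampl_sold,
    coalesce(SUM(t.supply_percentage) FILTER (WHERE t.action = 2), 0) as percent_ampl_transferred
from
    (
        select contract_address
        from addresses a
        where not exists (select 1 from address_tags at where at.address = a.contract_address and at.tag_id = 3)
    ) ca
cross join
    (
        SELECT date_trunc('hour', dd) as interval
        FROM generate_series
        (
            (now() at time zone 'utc') - interval '2 years',  -- assumed full history window
            (now() at time zone 'utc'),
            '1 hour'::interval
        ) dd
    ) i
left join transfers t on (t.from = ca.contract_address or t.to = ca.contract_address) and date_trunc('hour', t.timestamp at time zone 'utc') = i.interval
group by i.interval, ca.contract_address;
```

The BUFFERS option adds shared-buffer hit and read counts to each plan node, which helps tell an I/O-bound plan from a CPU-bound one.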

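Could you give the below a try? AFAIK there is no reason to cram everything into 1 query, so I split some of the parts out. I also split the or into two parts, which should make better use of the indexes. Then I noticed that this is exactly what Gordon did above (until now I had thought that finding a workaround that might be faster than UNION ALL was pretty clever =)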

I also added a WHERE on action; I'm not sure whether there are values other than 0, 1, 2. If there aren't, you can remove it again.

PS: untested and working blind here, just curious (and hopeful =)

DROP TABLE IF EXISTS _combined;

WITH intervals
  AS ( 
       SELECT i as interval            
          FROM generate_series(
                                date_trunc('hour', (now() at time zone 'utc') - interval '1 day'),
                                date_trunc('hour', (now() at time zone 'utc')),
                                '1 hour'::interval
                            ) i ),
     adrs 
  AS (
        SELECT a.contract_address
          FROM addresses a 
        EXCEPT
        SELECT at.address 
          FROM address_tags at
         WHERE at.tag_id = 3)
         
SELECT a.contract_address, i.interval
  INTO TEMPORARY TABLE _combined
  FROM intervals i
 CROSS JOIN adrs a;
           
CREATE UNIQUE INDEX uq_combined ON _combined ("interval", contract_address);

-- action comes from whichever of the two joined rows is present
SELECT c.interval, 
       c.contract_address,
       COALESCE(SUM(COALESCE(tf.amount           , tt.amount           , 0)) FILTER (WHERE COALESCE(tf.action, tt.action) = 0), 0) as amount_ampl_bought,
       COALESCE(SUM(COALESCE(tf.amount           , tt.amount           , 0)) FILTER (WHERE COALESCE(tf.action, tt.action) = 1), 0) as amount_ampl_sold,
       COALESCE(SUM(COALESCE(tf.amount           , tt.amount           , 0)) FILTER (WHERE COALESCE(tf.action, tt.action) = 2), 0) as amount_ampl_transferred,
       COALESCE(SUM(COALESCE(tf.supply_percentage, tt.supply_percentage, 0)) FILTER (WHERE COALESCE(tf.action, tt.action) = 0), 0) as percent_ampl_bought,
       COALESCE(SUM(COALESCE(tf.supply_percentage, tt.supply_percentage, 0)) FILTER (WHERE COALESCE(tf.action, tt.action) = 1), 0) as percent_ampl_sold,
       COALESCE(SUM(COALESCE(tf.supply_percentage, tt.supply_percentage, 0)) FILTER (WHERE COALESCE(tf.action, tt.action) = 2), 0) as percent_ampl_transferred
  FROM _combined c

  LEFT OUTER JOIN transfers tf 
               ON tf.from = c.contract_address  
              AND date_trunc('hour', tf.timestamp at time zone 'utc') = c.interval
              AND tf.action IN (0, 1, 2)

  LEFT OUTER JOIN transfers tt 
               ON tt.to = c.contract_address 
              AND date_trunc('hour', tt.timestamp at time zone 'utc') = c.interval
              AND tt.action IN (0, 1, 2)
       
group by c.interval, c.contract_address;

The ideal indexes for this query would be:

CREATE INDEX transfers_date_trunc_to_idx ON public.transfers USING btree (date_trunc('hour'::text, timezone('utc'::text, "timestamp")), "to") INCLUDE (action, amount, supply_percentage);
CREATE INDEX transfers_date_trunc_from_idx ON public.transfers USING btree (date_trunc('hour'::text, timezone('utc'::text, "timestamp")), "from") INCLUDE (action, amount, supply_percentage);
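For comparison, the UNION ALL form mentioned earlier in this answer might look like the sketch below. It is untested against the real schema. It lists every transfer once under its sender and once under its receiver, so an address that both sends and receives within the same hour does not multiply rows the way two left joins on the same key can. The CTE names hours and legs, the alias tg, and the 25-hour pre-filter are illustrative choices, and the pre-filter assumes "timestamp" is a timestamptz column.

```sql
WITH hours AS (
    SELECT h AS "interval"
    FROM generate_series(
             date_trunc('hour', (now() AT TIME ZONE 'utc') - interval '1 day'),
             date_trunc('hour', now() AT TIME ZONE 'utc'),
             interval '1 hour'
         ) AS h
),
legs AS (
    -- Sender side of every recent transfer.
    SELECT t."from" AS contract_address,
           date_trunc('hour', t."timestamp" AT TIME ZONE 'utc') AS "interval",
           t.action, t.amount, t.supply_percentage
    FROM transfers t
    -- Loose lower bound so a plain index on "timestamp" can be used
    -- (assumes timestamptz); the join below does the exact hour matching.
    WHERE t."timestamp" >= now() - interval '25 hours'
    UNION ALL
    -- Receiver side; self-transfers are skipped because the original
    -- OR join matched them only once.
    SELECT t."to",
           date_trunc('hour', t."timestamp" AT TIME ZONE 'utc'),
           t.action, t.amount, t.supply_percentage
    FROM transfers t
    WHERE t."timestamp" >= now() - interval '25 hours'
      AND t."to" IS DISTINCT FROM t."from"
)
SELECT h."interval", a.contract_address,
       coalesce(sum(l.amount) FILTER (WHERE l.action = 0), 0) AS amount_ampl_bought,
       coalesce(sum(l.amount) FILTER (WHERE l.action = 1), 0) AS amount_ampl_sold,
       coalesce(sum(l.amount) FILTER (WHERE l.action = 2), 0) AS amount_ampl_transferred,
       coalesce(sum(l.supply_percentage) FILTER (WHERE l.action = 0), 0) AS percent_ampl_bought,
       coalesce(sum(l.supply_percentage) FILTER (WHERE l.action = 1), 0) AS percent_ampl_sold,
       coalesce(sum(l.supply_percentage) FILTER (WHERE l.action = 2), 0) AS percent_ampl_transferred
FROM addresses a
CROSS JOIN hours h
LEFT JOIN legs l
       ON l.contract_address = a.contract_address
      AND l."interval" = h."interval"
WHERE NOT EXISTS (SELECT 1
                  FROM address_tags tg
                  WHERE tg.address = a.contract_address AND tg.tag_id = 3)
GROUP BY h."interval", a.contract_address;
```

Whether this is faster than the two-left-join version depends on the data and the available indexes, so it is worth comparing both plans with EXPLAIN (ANALYZE, BUFFERS).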

