Improving performance of updating large table with join

Currently I have a table with the following schema:

 mData | CREATE TABLE `mData` (
   `m1` mediumint(8) unsigned DEFAULT NULL,
   `m2` smallint(5) unsigned DEFAULT NULL,
   `m3` bigint(20) DEFAULT NULL,
   `m4` tinyint(4) DEFAULT NULL,
   `m5` date DEFAULT NULL,
   KEY `m_m1` (`m1`) USING HASH,
   KEY `m_date` (`m5`),
   KEY `m_m2` (`m2`),
   KEY `m_combined` (`m1`,`m2`,`m5`),
   KEY `m1_tradeday` (`m1`,`m5`)
 ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
 /*!50100 PARTITION BY RANGE ( YEAR(m5))
 SUBPARTITION BY HASH (MONTH(m5))
 (PARTITION p2013 VALUES LESS THAN (2014)
  (SUBPARTITION dec_2013 ENGINE = InnoDB,
   SUBPARTITION jan_2013 ENGINE = InnoDB,
   SUBPARTITION feb_2013 ENGINE = InnoDB,
   SUBPARTITION mar_2013 ENGINE = InnoDB,
   SUBPARTITION apr_2013 ENGINE = InnoDB,
   SUBPARTITION may_2013 ENGINE = InnoDB,
   SUBPARTITION jun_2013 ENGINE = InnoDB,
   SUBPARTITION jul_2013 ENGINE = InnoDB,
   SUBPARTITION aug_2013 ENGINE = InnoDB,
   SUBPARTITION sep_2013 ENGINE = InnoDB,
   SUBPARTITION oct_2013 ENGINE = InnoDB,
   SUBPARTITION nov_2013 ENGINE = InnoDB),
  PARTITION p2014 VALUES LESS THAN (2015)
  (SUBPARTITION dec_2014 ENGINE = InnoDB,
   SUBPARTITION jan_2014 ENGINE = InnoDB,
   SUBPARTITION feb_2014 ENGINE = InnoDB,
   SUBPARTITION mar_2014 ENGINE = InnoDB,
   SUBPARTITION apr_2014 ENGINE = InnoDB,
   SUBPARTITION may_2014 ENGINE = InnoDB,
   SUBPARTITION jun_2014 ENGINE = InnoDB,
   SUBPARTITION jul_2014 ENGINE = InnoDB,
   SUBPARTITION aug_2014 ENGINE = InnoDB,
   SUBPARTITION sep_2014 ENGINE = InnoDB,
   SUBPARTITION oct_2014 ENGINE = InnoDB,
   SUBPARTITION nov_2014 ENGINE = InnoDB),
  PARTITION p2015 VALUES LESS THAN (2016)
  (SUBPARTITION dec_2015 ENGINE = InnoDB,
   SUBPARTITION jan_2015 ENGINE = InnoDB,
   SUBPARTITION feb_2015 ENGINE = InnoDB,
   SUBPARTITION mar_2015 ENGINE = InnoDB,
   SUBPARTITION apr_2015 ENGINE = InnoDB,
   SUBPARTITION may_2015 ENGINE = InnoDB,
   SUBPARTITION jun_2015 ENGINE = InnoDB,
   SUBPARTITION jul_2015 ENGINE = InnoDB,
   SUBPARTITION aug_2015 ENGINE = InnoDB,
   SUBPARTITION sep_2015 ENGINE = InnoDB,
   SUBPARTITION oct_2015 ENGINE = InnoDB,
   SUBPARTITION nov_2015 ENGINE = InnoDB),
  PARTITION p2016 VALUES LESS THAN (2017)
  (SUBPARTITION dec_2016 ENGINE = InnoDB,
   SUBPARTITION jan_2016 ENGINE = InnoDB,
   SUBPARTITION feb_2016 ENGINE = InnoDB,
   SUBPARTITION mar_2016 ENGINE = InnoDB,
   SUBPARTITION apr_2016 ENGINE = InnoDB,
   SUBPARTITION may_2016 ENGINE = InnoDB,
   SUBPARTITION jun_2016 ENGINE = InnoDB,
   SUBPARTITION jul_2016 ENGINE = InnoDB,
   SUBPARTITION aug_2016 ENGINE = InnoDB,
   SUBPARTITION sep_2016 ENGINE = InnoDB,
   SUBPARTITION oct_2016 ENGINE = InnoDB,
   SUBPARTITION nov_2016 ENGINE = InnoDB),
  PARTITION pmax VALUES LESS THAN MAXVALUE
  (SUBPARTITION dec_max ENGINE = InnoDB,
   SUBPARTITION jan_max ENGINE = InnoDB,
   SUBPARTITION feb_max ENGINE = InnoDB,
   SUBPARTITION mar_max ENGINE = InnoDB,
   SUBPARTITION apr_max ENGINE = InnoDB,
   SUBPARTITION may_max ENGINE = InnoDB,
   SUBPARTITION jun_max ENGINE = InnoDB,
   SUBPARTITION jul_max ENGINE = InnoDB,
   SUBPARTITION aug_max ENGINE = InnoDB,
   SUBPARTITION sep_max ENGINE = InnoDB,
   SUBPARTITION oct_max ENGINE = InnoDB,
   SUBPARTITION nov_max ENGINE = InnoDB)) */ |

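In this table m1, m2 and m5 are indexed; a unique/primary key is not applicable in my case.

As the data grows (100,000 new rows are added every day), the UPDATE command is getting slower and slower.

I would like to know whether there is any way to improve the following statement.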

update mData as a join (select * from mData
                        where m1 = 326 and m5 = '2015-07-06' ) as b
            on  a.m5 > b.m5 and a.m1 = b.m1
            and a.m2 = b.m2 and a.m3 = b.m3
    set a.m4 = 0;

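I am quite sure that, in a SELECT statement, if I replace mData as a with (select * from mData where m1 = 326) as a, the execution time drops considerably (from 5 seconds to less than 1 second).

However, it is not possible to do the same thing in an UPDATE statement.

Is there any solution to speed up the update?

P.S. The table is already partitioned by YEAR(m5) and sub-partitioned by MONTH(m5).

Here is the EXPLAIN PARTITIONS output for my join query. It is quite messy, I hope you don't mind. Adding and a.m5 > '2015-07-06' does improve performance: the query time drops from 0.68 s to 0.2 s.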

explain partitions (select * from (select * from mData where m1 = 326) as a join (select * from mData where m1 = 326 and m5= '2015-07-06') as b on  a.m5 > b.m5 and a.m1 = b.m1 and a.m2 = b.m2 and a.m3 = b.m3 and a.m5 > '2015-07-06');

| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | Extra |
| 1 | PRIMARY | | NULL | ALL | NULL | NULL | NULL | NULL | 358 | |
| 1 | PRIMARY | | NULL | ALL | NULL | NULL | NULL | NULL | 1073 | Using where; Using join buffer |
| 3 | DERIVED | mData | p2015_jul_2015 | ref | m_m1,m_m5,m_combined,m1_m5 | m1_m5 | 8 | | 357 | Using where |
| 2 | DERIVED | mData | p2013_dec_2013,p2013_jan_2013,p2013_feb_2013,p2013_mar_2013,p2013_apr_2013,p2013_may_2013,p2013_jun_2013,p2013_jul_2013,p2013_aug_2013,p2013_sep_2013,p2013_oct_2013,p2013_nov_2013,p2014_dec_2014,p2014_jan_2014,p2014_feb_2014,p2014_mar_2014,p2014_apr_2014,p2014_may_2014,p2014_jun_2014,p2014_jul_2014,p2014_aug_2014,p2014_sep_2014,p2014_oct_2014,p2014_nov_2014,p2015_dec_2015,p2015_jan_2015,p2015_feb_2015,p2015_mar_2015,p2015_apr_2015,p2015_may_2015,p2015_jun_2015,p2015_jul_2015,p2015_aug_2015,p2015_sep_2015,p2015_oct_2015,p2015_nov_2015,p2016_dec_2016,p2016_jan_2016,p2016_feb_2016,p2016_mar_2016,p2016_apr_2016,p2016_may_2016,p2016_jun_2016,p2016_jul_2016,p2016_aug_2016,p2016_sep_2016,p2016_oct_2016,p2016_nov_2016,pmax_dec_max,pmax_jan_max,pmax_feb_max,pmax_mar_max,pmax_apr_max,pmax_may_max,pmax_jun_max,pmax_jul_max,pmax_aug_max,pmax_sep_max,pmax_nov_ | ref | m_m1,m_combined,m1_m5 | m_m1 | 4 | | 1074 | Using where |

Below is the EXPLAIN of the query that "Rick James" asked for:

EXPLAIN PARTITIONS select * from ccass_data where sid = 326 and trade_day = '2015-07-06';

| id | select_type | table | partitions     | type | possible_keys              | key   | key_len | ref         | rows | Extra       |
+----+-------------+-------+----------------+------+----------------------------+-------+---------+-------------+------+-------------+
|  1 | SIMPLE      | mData | p2015_jul_2015 | ref  | m_m1,m_m5,m_combined,m1_m5 | m1_m5 | 8       | const,const |  357 | Using where |

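First, I would use a fixed value of m5 to limit the partitions that have to be considered. Maybe you should also add a dummy condition on year(m5) and month(m5). Then I would create a temporary table for the subquery, with an index on m2 and m3. Then I would use fixed values for m1 and m5. But how many times is the query executed? 5 seconds is not a terrible result.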
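The temporary-table idea could look roughly like the following. This is an untested sketch: the table name tmp_ref and the index name tmp_m2_m3 are made up for illustration, and the fixed values 326 and '2015-07-06' are taken from the question.

```sql
-- Materialise the reference rows once, with an index on the join columns.
-- tmp_ref and tmp_m2_m3 are hypothetical names.
CREATE TEMPORARY TABLE tmp_ref (KEY tmp_m2_m3 (m2, m3))
SELECT m2, m3
FROM mData
WHERE m1 = 326 AND m5 = '2015-07-06';

-- Join against the small indexed table; m1 and m5 are given as fixed values
-- so the optimizer can use an (m1, m5) index and consider fewer partitions.
UPDATE mData AS a
JOIN tmp_ref AS b
  ON a.m2 = b.m2 AND a.m3 = b.m3
SET a.m4 = 0
WHERE a.m1 = 326
  AND a.m5 > '2015-07-06';

DROP TEMPORARY TABLE tmp_ref;
```

Since b.m1 and b.m5 are constants here, the original conditions a.m1 = b.m1 and a.m5 > b.m5 reduce to the two literals in the WHERE clause.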

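For starters, add INDEX(m1, m5). Once I see SHOW CREATE TABLE mData; I may have further suggestions.

Edit

Adding AND a.m5 > '2015-07-06' might kick in partition pruning. I don't have enough experience with UPDATE plus SUBPARTITION to make a prediction.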
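One way to check whether pruning actually happens is to run EXPLAIN PARTITIONS on a SELECT with the same join and predicates as the UPDATE. The query below is only a sketch built from the columns and values in the question. EXPLAIN PARTITIONS is the syntax the question already uses; in MySQL 8.0 that keyword was removed, because plain EXPLAIN shows the partitions column by default.

```sql
-- SELECT equivalent of the UPDATE, with the extra literal range on a.m5.
-- Look at the "partitions" column for alias a: if it still lists every
-- sub-partition, the range predicate did not prune anything.
EXPLAIN PARTITIONS
SELECT a.*
FROM mData AS a
JOIN mData AS b
  ON a.m1 = b.m1 AND a.m2 = b.m2 AND a.m3 = b.m3
WHERE b.m1 = 326
  AND b.m5 = '2015-07-06'
  AND a.m5 > '2015-07-06';
```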

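An InnoDB table really should have an explicit PRIMARY KEY. Could (m1, m2, m3, m5) be used as the PK?

USING HASH is ignored because InnoDB does not implement it. The index will be a BTree anyway, which is almost as good.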

KEY `m_m1` (`m1`)

is redundant and can be dropped, since another index (actually two) starts with m1.
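Dropping it would be a one-liner, assuming nothing references the index by name (for example in a FORCE INDEX hint):

```sql
-- m_combined (m1, m2, m5) and m1_tradeday (m1, m5) both start with m1,
-- so lookups on m1 alone can still use either of them.
ALTER TABLE mData DROP INDEX m_m1;
```

Note that the EXPLAIN in the question shows the optimizer picking m_m1 for one of the derived tables, so it is worth re-running EXPLAIN afterwards to see which index it chooses instead.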

Can't you do a JOIN instead of using a subquery? (That would avoid the tmp table.)
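A plain self-join version of the UPDATE might look like this. It is an untested sketch that reuses the column names and literal values from the question; the repeated conditions on a.m1 and a.m5 are there so the optimizer sees constants on the updated side as well.

```sql
-- Join mData to itself directly instead of to a derived table.
UPDATE mData AS a
JOIN mData AS b
  ON a.m1 = b.m1 AND a.m2 = b.m2 AND a.m3 = b.m3
SET a.m4 = 0
WHERE b.m1 = 326
  AND b.m5 = '2015-07-06'
  AND a.m1 = 326
  AND a.m5 > '2015-07-06';
```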

