简体   繁体   中英

MySQL Union Performance Issue

I have 3 tables (SuccessOrder, FailedOrder, and PendingOrder) with same columns. Each table is having more than 2 million records. I need to combine all the data from these 3 tables and sort them by CreatedDate to display in my portal. I am using UNION ALL to combine the result from all tables.

If I execute each subquery, getting the result in 1 or 2 seconds. If I execute the whole query (3 subqueries with UNION ALL), it is taking more than 5 minutes.

select * from (
select * from SuccessOrder
UNION ALL
select * from FailedOrder
UNION ALL
select * from PendingOrder
) t order by t.ID;

Is there any alternative to UNION ALL ?

Is it possible to create view from 3 queries without UNION ALL ?

Following are the Individual and union queries tested from workbench. I don't see much difference between UNION ALL and UNION ALL - ORDER BY

First Query

Duration/Fetch Time : 2.182 sec / 1.513 sec

SELECT  col1, col2, ...
    FROM  CompleteTxn ct
    left outer join  CompleteItem ci  ON (ct.Id = ci.TxnId)
    left outer join  ItemDispute id  ON (ct.Id = id.TxnId and  ci.Id = id.ItemId )
    left outer join  Merchant mc  ON (ct.MerchantId = mc.Id)
    left outer join  AdditionalTxnInfo addti  ON (ct.Id = addti.TxnId)
    where  (ct.PartitionKey>=55 AND  ct.PartitionKey<=56 )
      and  ( ct.TxnCompleteTime >= '2016-08-01 17:00:00'
        and  ct.TxnCompleteTime <= '2016-08-09 17:00:00'
           )
      and  ct.MnoId in (22,24,25,23,26,220,221,200,223,224,320, 400)
      and  ct.Status in (1,2,3,4,5);

Second Query

Duration/Fetch Time : 0.279 sec / 0.861 sec

SELECT  col1, col2, ...
    FROM  FailedOrder ct
    left outer join  FailedItem ci  ON (ct.Id = ci.TxnId)
    left outer join  ItemDispute id  ON (ct.Id = id.TxnId and  ci.Id = id.ItemId )
    left outer join  Merchant mc  ON (ct.MerchantId = mc.Id)
    left outer join  AdditionalTxnInfo addti  ON (ct.Id = addti.TxnId)
    where  (ct.PartitionKey>=55 AND  ct.PartitionKey<=56 )
      and  ( ct.TxnCompleteTime >= '2016-08-01 17:00:00'
        and  ct.TxnCompleteTime <= '2016-08-09 17:00:00'
           )
      and  ct.MnoId in (22,24,25,23,26,220,221,200,223,224,320, 400);

Union All withour order by

Duration/Fetch Time : 104.802 sec / 0.00027 sec

select  *
    FROM  
    (
        SELECT  col1, col2, ...
            FROM  FailedOrder ct
            left outer join  FailedItem ci  ON (ct.Id = ci.TxnId)
            left outer join  ItemDispute id  ON (ct.Id = id.TxnId
                      and  ci.Id = id.ItemId 
                          )
            left outer join  Merchant mc  ON (ct.MerchantId = mc.Id)
            left outer join  AdditionalTxnInfo addti  ON (ct.Id = addti.TxnId)
    where  (ct.PartitionKey>=55 AND  ct.PartitionKey<=56 )
      and  ( ct.TxnCompleteTime >= '2016-08-01 17:00:00'
        and  ct.TxnCompleteTime <= '2016-08-09 17:00:00'
           )
      and  ct.MnoId in (22,24,25,23,26,220,221,200,223,224,320, 400)
              and  ct.Status in (1,2,3,4,5)
    UNION  ALL 
        SELECT  col1, col2, ...
            FROM  CompleteTxn ct
            left outer join  CompleteItem ci  ON (ct.Id = ci.TxnId)
            left outer join  ItemDispute id  ON (ct.Id = id.TxnId
                      and  ci.Id = id.ItemId
                          )
            left outer join  Merchant mc  ON (ct.MerchantId = mc.Id)
            left outer join  AdditionalTxnInfo addti  ON (ct.Id = addti.TxnId)
    where  (ct.PartitionKey>=55 AND  ct.PartitionKey<=56 )
      and  ( ct.TxnCompleteTime >= '2016-08-01 17:00:00'
        and  ct.TxnCompleteTime <= '2016-08-09 17:00:00'
           )
      and  ct.MnoId in (22,24,25,23,26,220,221,200,223,224,320, 400)        ) t ;

Union All with order by

Duration/Fetch Time : 104.895 sec / 0.00028 sec

select * FROM (

SELECT col1, col2, ... FROM FailedOrder ct left outer join FailedItem ci ON (ct.Id = ci.TxnId) left outer join ItemDispute id ON (ct.Id = id.TxnId and ci.Id = id.ItemId) left outer join Merchant mc ON (ct.MerchantId = mc.Id) left outer join AdditionalTxnInfo addti ON (ct.Id = addti.TxnId) where (ct.PartitionKey>=55 AND ct.PartitionKey<=56) and ( ct.TxnCompleteTime >= '2016-08-01 17:00:00' and ct.TxnCompleteTime <= '2016-08-09 17:00:00') and ct.MnoId in (22,24,25,23,26,220,221,200,223,224,320,400) and ct.Status in (1,2,3,4,5)

UNION ALL

SELECT col1, col2, ... FROM CompleteTxn ct left outer join CompleteItem ci ON (ct.Id = ci.TxnId) left outer join ItemDispute id ON (ct.Id = id.TxnId and ci.Id = id.ItemId) left outer join Merchant mc ON (ct.MerchantId = mc.Id) left outer join AdditionalTxnInfo addti ON (ct.Id = addti.TxnId) where (ct.PartitionKey>=55 AND ct.PartitionKey<=56) and ( ct.TxnCompleteTime >= '2016-08-01 17:00:00' and ct.TxnCompleteTime <= '2016-08-09 17:00:00') and ct.MnoId in (22,24,25,23,26,220,221,200,223,224,320,400)

) t ORDER BY id desc;

As per my assumption you are dealing with Huge data almost 6 million records.

1)when comparing to Union , Union All is far more faster

2)you are taking all the select statements into Derived table .

3)Again you are doing ORDER BY on ID ( performance Issue )

It will give huge performance impact if you do Order by on Huge Data.

All the Result will get Ordered in Sequential manner so Definitely SORTING COST Percentage will be increased

select Col1,Col2 from SuccessOrder
UNION ALL
select Col1,Col2  from FailedOrder
UNION ALL
select Col1,Col2  from PendingOrder

Your order by is most likely the culprit here. You're essentially ordering your whole database after select, and depending how its indexed, that can take a long time.

You could:

  • make sure the tables are indexed on what you order by (important)
  • remove the order by alltogether

I don't think that the union all cost comes close to the cost of sorting all that data.

I will plead ignorance as to why union is so long compared to the selects, someone far smarter than me will need to testify to that. I repeated the results you obtained with my own database with a similar overhead. This being said, if your sorting off a subselect, you lose the indexing of the tables

select * from (
(select * from SuccessOrder order by ID limit 100)
UNION ALL
(select * from FailedOrder  order by ID limit 100)
UNION ALL
(select * from PendingOrder order by ID limit 100)
) t order by t.ID desc limit 100;

would take advantage of the indexing and return your 100 without too much overhead. If you wanted all 6 million rows paged... that's slightly different, but a straight up last 100 is reasonable, but this depends on how your id column is generated, if its the equivalent of a record number, this answer helps no one.

Some of these tips may help:

  • You don't need the outer SELECT
  • Don't return millions of rows; process them in MySQL, then deliver only the ones you need, or summarize the you need.
  • If you are going to use only the first few, see documentation on UNION
  • Minimize the number of columns to return. The extra bulk of extra columns is part of the slowdown.
  • Change (ct.PartitionKey>=55 AND ct.PartitionKey<=56 ) to ct.PartitionKey IN (55,56) if practical
  • In ct.TxnCompleteTime <= '2016-08-09 17:00:00' , change <= to < .
  • Have an index with one of the INs first and TxnCompleteTime second.
  • Consider putting all the data in one table.

Please provide SHOW CREATE TABLE .

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM