MySQL ORDER BY extremely slow. How to optimize?

I am looking for a solution for my problem with extremely slow queries and I hope you can help me.

First of all, what I want to do is the following:
I've got a table, let's call it item_table, with some information about goods. Every item has an orderid. The problem here is that this order id is not unique.
Every change to an item is recorded "progressively", so to speak. Here is an example:

             order_id     max_vol   remain_vol
Purchase 1   2007468329   8753      4126
Purchase 2   2007468329   8753      4122
Purchase 3   2007468329   8753      4006

To explain that a bit:
Every time a person buys an item, a new entry is added with the same order id and an updated remaining volume (remain_vol). The max_vol is the total volume the seller put in stock at the beginning. An item can have multiple order ids, because every time a seller lists something (even if it is the same item) it gets a new order id.

What I now want to do is the following:
I want to get the item with the most sold units. That means I want the difference between MAX(remain_vol) and MIN(remain_vol), and only for items where anything got sold (that is, max_vol != remain_vol).

To get a bit more specific:
Here is the create table for my data table:

CREATE TABLE `data` (    
    `orderid` bigint(20) DEFAULT NULL,    
    `regionid` int(11) DEFAULT NULL,    
    `systemid` int(11) DEFAULT NULL,
    `stationid` int(11) DEFAULT NULL,
    `typeid` int(11) DEFAULT NULL,
    `bid` int(11) DEFAULT NULL,
    `price` float DEFAULT NULL,
    `minvolume` int(11) DEFAULT NULL,
    `volremain` int(11) DEFAULT NULL,
    `volenter` int(11) DEFAULT NULL,
    `issued` datetime DEFAULT NULL,
    `duration` varchar(32) DEFAULT NULL,
    `range` int(11) DEFAULT NULL,
    `reportedby` int(11) DEFAULT NULL,
    `reportedtime` datetime DEFAULT NULL,
      KEY `orderid` (`orderid`) USING BTREE,
      KEY `volremain` (`volremain`) USING BTREE,
      KEY `volenter` (`volenter`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8

The max_vol column I mentioned is volenter in this table, and remain_vol is volremain.

This table contains about 60 million entries.

Does anyone have an idea how to solve this problem?
I already tried some queries but they all take ages to execute.

Kind regards and hoping for a solution
- Lyrex

I think the problem is the way your table is structured. When a user buys an item you shouldn't be appending a new entry to the table. There are many things wrong with this approach.

First of all, orders should have a unique order ID, unless there's a really good reason not to. What you should do instead is make order ID unique, and give it the fields init_vol, max_vol, and sold. When a user buys an item, you increment the field sold. When you want to get the most sold units, you order by sold descending.

This way you are not growing the table unnecessarily. All your queries become much simpler and faster.
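As a rough sketch of that layout (the table name orders, the column types, and the extra index on sold are my assumptions, not something taken from your schema):

```sql
-- Hypothetical table: one row per order instead of one row per change.
CREATE TABLE orders (
    orderid  BIGINT NOT NULL,
    init_vol INT    NOT NULL,
    max_vol  INT    NOT NULL,
    sold     INT    NOT NULL DEFAULT 0,
    PRIMARY KEY (orderid),
    KEY sold (sold)            -- assumed index so ORDER BY sold can read in index order
) ENGINE=InnoDB;

-- A purchase of 4 units (4126 -> 4122 in your example) updates the existing row.
UPDATE orders SET sold = sold + 4 WHERE orderid = 2007468329;

-- Orders with the most sold units.
SELECT orderid, sold
  FROM orders
 ORDER BY sold DESC
 LIMIT 10;
```

With a layout like this the table grows with the number of orders, not with the number of purchases, and the "most sold" query needs no aggregation at all.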

Based on the question asked, plus a few assumptions, hopefully this answer can help you.

I would create a covering index on your data table of

( typeid, orderid, volremain )

Not knowing what the columns mean, I am assuming (yeah, I know, assume) that typeid is some sort of indicator of buy or sell. If you are looking only for 'sell', this can help the query. Having orderid in the index also helps the grouping, and including volremain (remain_vol in the question) means the query can be answered from the index alone, without going back to the raw data pages.
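Creating that index could look something like the following. The index name and the typeid value 34 are placeholders I made up, and building an index on a table of this size will likely take a while:

```sql
-- Composite index: filter column first, then the grouping column,
-- then the column being aggregated.
ALTER TABLE `data`
    ADD INDEX idx_type_order_vol (typeid, orderid, volremain);

-- Check the plan. If "Using index" shows up in the Extra column,
-- the query is being answered from the index alone.
EXPLAIN
SELECT orderid, MAX(volremain), MIN(volremain)
  FROM `data`
 WHERE typeid = 34            -- placeholder value, substitute your own
 GROUP BY orderid;
```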

I would also have a covering index on your "item_table", something like

(orderid, item)

so it can be efficiently joined to the resulting sell orders, and the item (such as a stock name) is available for quick reference without going to the raw data pages.

That said, I would then try something like

SELECT
      t.item,
      SUM( PreAgg.MaxVol ) as TotalVolPerItem,
      SUM( PreAgg.MinVol ) as TotalRemainingToSell
   from
      item_table t
      JOIN (SELECT
                  d.orderid,
                  MAX( d.volremain ) as MaxVol,
                  MIN( d.volremain ) as MinVol
               from
                  data d
               where
                  d.typeid = 'sell'  -- or whatever flag marks a sell order, if this assumption is correct
               group by
                  d.orderid
               having
                  MIN( d.volremain ) > 0 ) PreAgg
         ON t.orderid = PreAgg.orderid
   group by
      t.item

The "HAVING" clause keeps only orders whose minimum remaining volume is above zero, that is, orders with something left. For example, if an order was for 500 of something and gradually sold off to 400, 300, 200, 150, 76, the remaining amount of 76 is the MinVol for that order, and the order is included.
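If what you are after is the number of units sold per order, as the question describes it (MAX minus MIN of the remaining volume), a variation along these lines might be closer. This is only a sketch against the data table from the question; the units_sold alias and the LIMIT of 10 are my own choices:

```sql
SELECT
      d.orderid,
      MAX( d.volremain ) - MIN( d.volremain ) AS units_sold
   FROM
      data d
   GROUP BY
      d.orderid
   HAVING
      MAX( d.volremain ) > MIN( d.volremain )   -- keep only orders where something was sold
   ORDER BY
      units_sold DESC
   LIMIT 10;
```

For the three sample rows in the question (remaining volumes 4126, 4122, 4006) this gives 4126 - 4006 = 120 units for order 2007468329. Since this version does not filter on typeid, an index on ( orderid, volremain ) would be the covering one here. The final sort on the aggregate still has to be done after grouping, so it is worth checking the plan with EXPLAIN.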
