
MySQL select AVG, ORDER BY, GROUP BY & LIMIT

The statement below does not work, but I can't seem to figure out why:

select AVG(delay_in_seconds) from A_TABLE ORDER by created_at DESC GROUP BY row_type limit 1000;

I want to get the averages of the most recent 1000 rows for each row_type. created_at is of type DATETIME and row_type is of type VARCHAR.

If you only want the 1000 most recent rows, regardless of row_type, and then get the average of delay_in_seconds for each row_type, that's a fairly straightforward query. For example:

SELECT t.row_type
     , AVG(t.delay_in_seconds)
  FROM (
         SELECT r.row_type
              , r.delay_in_seconds
           FROM A_table r
          ORDER BY r.created_at DESC
          LIMIT 1000
       ) t
 GROUP BY t.row_type

I suspect, however, that this query does not satisfy the requirements that were specified. (I know it doesn't satisfy what I understood as the specification.)

If what we want is the average of the most recent 1000 rows for each row_type, that would also be fairly straightforward... if we were using a database that supported analytic functions.
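As a rough sketch of what that could look like in a database with analytic (window) functions: the standard ROW_NUMBER() function numbers the rows within each row_type, most recent first, and the outer query keeps only the first 1000 per group. The aliases s, rn and avg_delay_in_seconds are made up for illustration; the table and column names are taken from the question.

```sql
SELECT s.row_type
     , AVG(s.delay_in_seconds) AS avg_delay_in_seconds
  FROM (
         -- number the rows within each row_type, most recent first
         SELECT t.row_type
              , t.delay_in_seconds
              , ROW_NUMBER() OVER (PARTITION BY t.row_type
                                       ORDER BY t.created_at DESC) AS rn
           FROM A_table t
       ) s
 WHERE s.rn <= 1000      -- keep the 1000 most recent rows per row_type
 GROUP BY s.row_type;
```

Rows that share the same created_at are numbered in an arbitrary order relative to each other, so which of them makes the cut at row 1000 is not deterministic unless a tie-breaking column is added to the ORDER BY.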

Unfortunately, MySQL versions before 8.0 don't provide support for analytic functions (window functions were added in MySQL 8.0). It is possible to emulate one in older versions, but the syntax is a bit involved, and it depends on behavior that is not guaranteed.

As an example:

SELECT s.row_type
     , AVG(s.delay_in_seconds)
  FROM ( 
         SELECT @row_ := IF(@prev_row_type = t.row_type, @row_ + 1, 1) AS row_
              , @prev_row_type := t.row_type AS row_type
              , t.delay_in_seconds
           FROM A_table t
          CROSS
           JOIN (SELECT @prev_row_type := NULL, @row_ := NULL) i
          ORDER BY t.row_type DESC, t.created_at DESC
       ) s
 WHERE s.row_ <= 1000
 GROUP
    BY s.row_type

NOTES:

The inline view query is going to be expensive for large sets. What it effectively does is assign a row number to each row. The ORDER BY sorts the rows by row_type and then in descending sequence by created_at, because we want the most recent row to be assigned a value of 1, the next most recent 2, and so on. This numbering restarts for each distinct value of row_type.

For performance, we'd want a suitable index with leading columns (row_type, created_at, delay_in_seconds) to avoid an expensive "Using filesort" operation. We need at least the first two columns for that; including delay_in_seconds makes it a covering index (the query can be satisfied entirely from the index).

The outer query then runs against the resultset returned from the inline view query (a "derived table"). The predicate in the WHERE clause filters out all rows that were assigned a row number greater than 1000; the rest is a straightforward GROUP BY and an AVG aggregate.

A LIMIT clause is entirely unnecessary. It may be possible to add predicates for some additional performance enhancement. For example, what if we specified the most recent 1000 rows, but only those with created_at within the past 30 or 90 days?
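A minimal sketch of such an index, assuming the column in the table is named delay_in_seconds as in the queries above. The index name A_table_ix1 is arbitrary and chosen only for illustration.

```sql
-- row_type and created_at let MySQL read rows in the required order;
-- delay_in_seconds is included so the index covers the whole query
CREATE INDEX A_table_ix1
    ON A_table (row_type, created_at, delay_in_seconds);
```

Running EXPLAIN on the inline view query before and after creating the index is one way to check whether "Using filesort" has disappeared from the Extra column.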
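To illustrate the idea of an extra predicate, here is a sketch of the same user-variable query with a date filter added to the inline view. The 90-day cutoff is an assumption made for the example, not part of the original question, and it changes the result for any row_type that has fewer than 1000 rows in that window.

```sql
SELECT s.row_type
     , AVG(s.delay_in_seconds)
  FROM (
         SELECT @row_ := IF(@prev_row_type = t.row_type, @row_ + 1, 1) AS row_
              , @prev_row_type := t.row_type AS row_type
              , t.delay_in_seconds
           FROM A_table t
          CROSS
           JOIN (SELECT @prev_row_type := NULL, @row_ := NULL) i
          -- assumption: only rows created in the past 90 days are of interest
          WHERE t.created_at >= NOW() - INTERVAL 90 DAY
          ORDER BY t.row_type DESC, t.created_at DESC
       ) s
 WHERE s.row_ <= 1000
 GROUP BY s.row_type;
```

The filter reduces the number of rows that have to be sorted and numbered. Like the original, this still relies on the order in which MySQL evaluates user-defined variables, which is not guaranteed.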

(I'm not entirely sure this answers the question that OP was asking. What this answers is: Is there a query that can return the specified resultset, making use of AVG aggregate and GROUP BY , ORDER BY and LIMIT clauses.)

NB This query is dependent on a behavior of MySQL user-defined variables which is not guaranteed.


The query above shows one approach, but there is also another. It's possible to use a "join" operation (of A_table with A_table) to get a row number assigned, by getting a COUNT of the number of rows that are "more recent" than each row. With large sets, however, that can produce a humongous intermediate result if we aren't careful to limit it.
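A rough sketch of that join approach follows. It assumes A_table has a unique key column named id, which is hypothetical (the question does not list one); the aliases t, n and c are likewise illustrative.

```sql
SELECT c.row_type
     , AVG(c.delay_in_seconds)
  FROM (
         -- for each row t, count the rows n of the same row_type that are more recent
         SELECT t.id
              , t.row_type
              , t.delay_in_seconds
           FROM A_table t
           LEFT
           JOIN A_table n
             ON n.row_type   = t.row_type
            AND n.created_at > t.created_at
          GROUP BY t.id, t.row_type, t.delay_in_seconds
         -- fewer than 1000 more-recent rows means t is among the 1000 most recent
         HAVING COUNT(n.id) < 1000
       ) c
 GROUP BY c.row_type;
```

Rows with identical created_at values get the same count, so a tie at the boundary can let slightly more than 1000 rows through for a row_type. The join also pairs every row with every more recent row of the same row_type, which is where the very large intermediate result comes from.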

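Put the ORDER BY clause near the end of the statement, after GROUP BY and before LIMIT:

SELECT AVG(delay_in_seconds) FROM A_TABLE GROUP BY row_type ORDER BY created_at DESC LIMIT 1000;

This only fixes the clause order. LIMIT here restricts the number of result rows (one per row_type), not the rows being averaged. See the MySQL reference manual for details.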
