How Do I Select the Entire Record When Using MAX() With GROUP BY

Using MySQL, I would like to refactor the following SELECT statement to return the entire record containing the newest invoice_date:

> SELECT id, invoice_id, invoice_date
  FROM invoice_items
  WHERE invoice_id = 1047

id    invoice_id   invoice_date
-----------------------------------
3235    1047         2009-12-15 11:40:00
3295    1047         2009-12-15 16:00:00
3311    1047         2009-12-15 09:30:00
3340    1047         2009-12-15 13:50:00

Using the MAX() aggregate function and the GROUP BY clause gets me part of the way there:

> SELECT id, invoice_id, max(invoice_date)
  FROM invoice_items
  WHERE invoice_id = 1047
  GROUP BY invoice_id


id    invoice_id   invoice_date
-----------------------------------
3235    1047         2009-12-15 16:00:00

Notice that the query appears to get the MAX(invoice_date) correctly, but the id returned (3235) is not the id of the record containing the MAX(invoice_date), which is 3295. Instead, it is the id of the first record in the initial query.

How do I refactor this query to give me the entire record containing the MAX(invoice_date)?

The solution must use the GROUP BY clause, because I need to get the newest invoice_date for each invoice.

This is the often-repeated "greatest-n-per-group" problem.

Here's how I would solve it in MySQL:

SELECT i1.*
FROM invoice_items i1
LEFT OUTER JOIN invoice_items i2
  ON (i1.invoice_id = i2.invoice_id AND i1.invoice_date < i2.invoice_date)
WHERE i2.invoice_id IS NULL;

Explanation: for each row i1, try to find a row i2 with the same invoice_id and a greater date. If none is found (i.e., i2 is all NULLs because of the outer join), then i1 must be the row with the greatest date for its invoice_id.

This join-based solution tends to work better in MySQL, which is weak at optimizing both GROUP BY and subqueries.
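To see the anti-join at work, here is a minimal sketch that loads the four sample rows from the question and runs the query unchanged. It assumes Python's built-in sqlite3 module as a stand-in for MySQL (purely so it runs without a server) and stores the dates as ISO-formatted text, which compares in chronological order.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the join itself is standard SQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoice_items "
    "(id INTEGER PRIMARY KEY, invoice_id INTEGER, invoice_date TEXT)"
)
conn.executemany(
    "INSERT INTO invoice_items VALUES (?, ?, ?)",
    [
        (3235, 1047, "2009-12-15 11:40:00"),
        (3295, 1047, "2009-12-15 16:00:00"),
        (3311, 1047, "2009-12-15 09:30:00"),
        (3340, 1047, "2009-12-15 13:50:00"),
    ],
)

# Keep only the rows i1 for which no row i2 of the same invoice is newer.
anti_join = """
SELECT i1.*
FROM invoice_items i1
LEFT OUTER JOIN invoice_items i2
  ON (i1.invoice_id = i2.invoice_id AND i1.invoice_date < i2.invoice_date)
WHERE i2.invoice_id IS NULL
"""
latest_rows = conn.execute(anti_join).fetchall()
print(latest_rows)
```

The only row that survives is id 3295, the one carrying the 16:00:00 date. If two rows of one invoice shared the same newest date, both would be returned, since neither has a strictly newer partner.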

I'm assuming that, since the table is named invoice_items, there will be multiple rows for a given invoice, so you should probably use something like this:

SELECT * FROM invoice_items 
WHERE invoice_date IN (SELECT MAX(invoice_date) FROM invoice_items)

If you aren't concerned about two records having the same invoice date, you could just do this:

SELECT * FROM invoice_items
ORDER BY invoice_date DESC
LIMIT 1
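Both queries above look for the newest date across the whole table, not per invoice. The sketch below illustrates what that means once a second invoice is present. The rows for invoice 1048 are hypothetical, added only for illustration, and sqlite3 is again assumed as a stand-in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoice_items "
    "(id INTEGER PRIMARY KEY, invoice_id INTEGER, invoice_date TEXT)"
)
conn.executemany(
    "INSERT INTO invoice_items VALUES (?, ?, ?)",
    [
        (3235, 1047, "2009-12-15 11:40:00"),
        (3295, 1047, "2009-12-15 16:00:00"),
        (3311, 1047, "2009-12-15 09:30:00"),
        (3340, 1047, "2009-12-15 13:50:00"),
        # Hypothetical second invoice, not part of the original question.
        (3400, 1048, "2009-12-14 10:00:00"),
        (3401, 1048, "2009-12-14 12:00:00"),
    ],
)

# Rows whose date equals the single newest date in the whole table.
global_max_ids = [
    row[0]
    for row in conn.execute(
        "SELECT * FROM invoice_items "
        "WHERE invoice_date IN (SELECT MAX(invoice_date) FROM invoice_items)"
    )
]

# The single newest row in the whole table.
limit_ids = [
    row[0]
    for row in conn.execute(
        "SELECT * FROM invoice_items ORDER BY invoice_date DESC LIMIT 1"
    )
]
print(global_max_ids, limit_ids)
```

Each query returns only id 3295. The newest row of the hypothetical invoice 1048 (id 3401) is absent, so these forms fit the question only when a single invoice is being queried.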

Write it pretty much exactly the way you'd say it in English:

"Get me the invoice with the latest Invoice Date"

Select * From invoice_items
Where invoice_date =
   (Select Max(invoice_date)
    From invoice_items)

But I think something is wrong in your schema. Since there are multiple rows with the same invoice_id, this looks like an invoice details or invoice line items table, not an invoice table. If so, how can the line items within the same invoice have different invoice dates? If the dates differ, then they are not invoice dates but invoice detail dates (whatever that means), and should be labeled as such.

Here's my attempt:

SELECT t1.*
FROM INVOICE_ITEMS t1,
   (SELECT INVOICE_ID, MAX(INVOICE_DATE) as invoice_date2
      FROM INVOICE_ITEMS
     GROUP BY INVOICE_ID) t2
WHERE t1.invoice_id = t2.invoice_id
AND t1.invoice_date = t2.invoice_date2
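As a rough check of this derived-table approach, the sketch below runs the query verbatim against the sample rows plus a hypothetical second invoice (1048, invented for illustration). It assumes sqlite3 as a stand-in for MySQL; the result is sorted in Python only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoice_items "
    "(id INTEGER PRIMARY KEY, invoice_id INTEGER, invoice_date TEXT)"
)
conn.executemany(
    "INSERT INTO invoice_items VALUES (?, ?, ?)",
    [
        (3235, 1047, "2009-12-15 11:40:00"),
        (3295, 1047, "2009-12-15 16:00:00"),
        (3311, 1047, "2009-12-15 09:30:00"),
        (3340, 1047, "2009-12-15 13:50:00"),
        # Hypothetical second invoice, not part of the original question.
        (3400, 1048, "2009-12-14 10:00:00"),
        (3401, 1048, "2009-12-14 12:00:00"),
    ],
)

# t2 holds one (invoice_id, newest date) pair per invoice;
# joining back to t1 recovers the full row for each pair.
per_invoice = """
SELECT t1.*
FROM INVOICE_ITEMS t1,
   (SELECT INVOICE_ID, MAX(INVOICE_DATE) as invoice_date2
      FROM INVOICE_ITEMS
     GROUP BY INVOICE_ID) t2
WHERE t1.invoice_id = t2.invoice_id
AND t1.invoice_date = t2.invoice_date2
"""
newest_per_invoice = sorted(conn.execute(per_invoice).fetchall())
print(newest_per_invoice)
```

This yields one full row per invoice (ids 3295 and 3401 here). As with any match on the maximum date, an invoice with two rows sharing its newest date would produce two rows.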

SELECT * 
FROM invoice_items
WHERE invoice_id = 1047
ORDER BY invoice_date desc LIMIT 1

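Or, if id is your primary key and always increases in step with invoice_date (which is not the case in the question's sample rows, where id 3340 has an earlier date than id 3295):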

SELECT * 
FROM invoice_items
WHERE invoice_id = 1047
ORDER BY id desc LIMIT 1
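Whether ordering by id is a safe shortcut depends on the data. The sketch below compares the two orderings on the four sample rows from the question. It assumes sqlite3 as a stand-in for MySQL and filters on invoice_id, since the sample output shows no lot column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoice_items "
    "(id INTEGER PRIMARY KEY, invoice_id INTEGER, invoice_date TEXT)"
)
conn.executemany(
    "INSERT INTO invoice_items VALUES (?, ?, ?)",
    [
        (3235, 1047, "2009-12-15 11:40:00"),
        (3295, 1047, "2009-12-15 16:00:00"),
        (3311, 1047, "2009-12-15 09:30:00"),
        (3340, 1047, "2009-12-15 13:50:00"),
    ],
)

# Newest row by date.
by_date = conn.execute(
    "SELECT * FROM invoice_items WHERE invoice_id = 1047 "
    "ORDER BY invoice_date DESC LIMIT 1"
).fetchone()

# Row with the highest id, which is only the newest if ids follow the dates.
by_id = conn.execute(
    "SELECT * FROM invoice_items WHERE invoice_id = 1047 "
    "ORDER BY id DESC LIMIT 1"
).fetchone()
print(by_date, by_id)
```

On this data the two disagree: ordering by date gives id 3295, while ordering by id gives 3340, whose date is 13:50:00. The id shortcut is therefore only appropriate when rows are known to be inserted in date order.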
