I started writing a post about this a few days ago, when I thought I might have found a good solution after reading a few of the suggested posts. Unfortunately, we found out today that there are some gaps in the data. As far as I can tell, the query is the cause.
I'm trying to pull a year's worth of spend data for items. If an item was purchased multiple times during the year, I group on it and SUM() its quantities. The idea is to get a single row with all the needed info instead of 2, 10, or 100 rows, one for each transaction that occurred.
The catch is that I don't want a year's worth of data counted back from today. I want a year's worth that ends at the MAX(date) for that company's batch of spend. In other words, maybe a product was last purchased 3 months ago, but the company's last known transaction/PO was only 1 month ago. The dates in our data should then cover the year ending 1 month ago, i.e. from 13 months ago up to 1 month ago.
Here is what I've currently got:
SELECT id, groupId, datePurchase, vendor, venItem, item, itemDesc, uom, SUM(qtyPurchase) AS `qtyPurchase`, SUM(extPrice) AS `totalPrice`, MAX(unitPrice) AS `maxPrice`, unitPrice, code, stripped_venitem, MAX(datePurchase) as `maxDate`, SUM(extPrice) as `extPrice`, PONum
FROM transactions t
WHERE (((qtyPurchase)!=0))
GROUP BY groupId, vendor, venItem, item, itemDesc, uom
HAVING datePurchase > (MAX(datePurchase) - INTERVAL 1 YEAR);
If I remove the HAVING portion, the data in question shows up. It's not that its date falls outside the prescribed year, though: when I remove the HAVING clause and find the product we're having a problem with, its maxDate value is "2015-08-26".
This is all part of an import to Solr so I need to get it right within a single query, no extra tricks or processing. Thanks for any insight!
EDIT 1: It's also worth noting that I cannot move the HAVING datePurchase > (MAX(datePurchase) - INTERVAL 1 YEAR) portion into the WHERE clause. If it is added directly, it throws an "invalid use of group function" error, and if it is added as a subquery ( ...AND datePurchase > (SELECT MAX(datePurchase)... ), it uses the MAX(datePurchase) of the entire dataset, not that of the particular groupId it should relate to.
That is worth stressing as well: the max date should relate to each particular entity's (listed as groupId) batch of transactions, not the entire table.
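One way to tie the cutoff to each groupId while still using a single query is to join against a derived table of per-group maximum dates and filter the rows in WHERE before grouping. The following is only a sketch, not a tested solution: it assumes the table and columns from the query above, the aliases m and lastDate are made up for illustration, and whether zero-quantity rows should count toward the last date is a guess. It leaves out id, unitPrice, code, stripped_venitem and PONum because they are not in the GROUP BY, so MySQL (with ONLY_FULL_GROUP_BY disabled) would take their values from an arbitrary row in each group.

```sql
-- Sketch only: table and column names come from the query in the post;
-- the aliases m and lastDate are hypothetical.
SELECT t.groupId, t.vendor, t.venItem, t.item, t.itemDesc, t.uom,
       SUM(t.qtyPurchase)  AS qtyPurchase,
       SUM(t.extPrice)     AS totalPrice,
       MAX(t.unitPrice)    AS maxPrice,
       MAX(t.datePurchase) AS maxDate
FROM transactions t
JOIN (
    -- One row per groupId: that entity's most recent purchase date.
    -- Assumption: zero-quantity rows are ignored here as well.
    SELECT groupId, MAX(datePurchase) AS lastDate
    FROM transactions
    WHERE qtyPurchase != 0
    GROUP BY groupId
) m ON m.groupId = t.groupId
WHERE t.qtyPurchase != 0
  -- Row-level filter, applied before grouping: keep only the year
  -- that ends at this groupId's own last purchase date.
  AND t.datePurchase > m.lastDate - INTERVAL 1 YEAR
GROUP BY t.groupId, t.vendor, t.venItem, t.item, t.itemDesc, t.uom;
```

Filtering in WHERE also avoids testing the non-aggregated datePurchase in HAVING, where MySQL is free to pick its value from any row in the group. That may be why some groups disappeared even though their maxDate was recent.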
I'm guessing it was a date-arithmetic issue. I changed HAVING datePurchase > (MAX(datePurchase) - INTERVAL 1 YEAR) to HAVING datePurchase > (MAX(datePurchase) - DATE_SUB(MAX(datePurchase), INTERVAL 1 YEAR)) and voila!
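One way to check what that new condition actually compares against is to select both expressions per group. This is only a sketch using the table and column names from the query above; the aliases windowStart and newThreshold are made up for illustration. In MySQL, subtracting one date expression from another without INTERVAL is numeric arithmetic, so newThreshold is likely to come back as a number rather than a date. If so, the HAVING comparison may pass for every group instead of enforcing a one-year window.

```sql
-- Sketch only: inspect the values behind the rewritten HAVING clause.
SELECT groupId, vendor, venItem, item, itemDesc, uom,
       MAX(datePurchase) AS maxDate,
       -- Start of the intended one-year window for this group.
       DATE_SUB(MAX(datePurchase), INTERVAL 1 YEAR) AS windowStart,
       -- Right-hand side of the rewritten HAVING clause.
       MAX(datePurchase) - DATE_SUB(MAX(datePurchase), INTERVAL 1 YEAR) AS newThreshold
FROM transactions
WHERE qtyPurchase != 0
GROUP BY groupId, vendor, venItem, item, itemDesc, uom;
```

If newThreshold shows up as a small number for every row, the rewritten clause is effectively no longer filtering by date, which would explain why the missing rows came back.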