Here is the simplified table:
id - company_id - report_year - code
1 - 123456 - 2013 - ASD
2 - 123456 - 2013 - SDF
3 - 123456 - 2012 - ASD
4 - 123456 - 2012 - SDF
I would like to get all codes for the highest report_year available for the specified company_id.
So I should get:
1 - 123456 - 2013 - ASD
2 - 123456 - 2013 - SDF
But I cannot hard-code WHERE report_year = 2013, because for some companies the latest report year may be 2012 or 2009, for example. So I need to get data based on the latest year available.
So far I have a query like this:
SELECT id, company_id, report_year, code
FROM `my_table`
WHERE company_id = 123456
I have tried some mixtures of GROUP BY and MAX() but couldn't get what I need. This is the first time I am facing such a request, and it's confusing.
Any ideas? I am using MySQL.
Use a correlated subquery to find the latest year for a company:
SELECT id, company_id, report_year, code
FROM `my_table` t1
WHERE company_id = 123456
  AND report_year = (SELECT MAX(report_year)
                     FROM `my_table` t2
                     WHERE t1.company_id = t2.company_id)
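To try the correlated subquery without a MySQL server, here is a minimal sketch using Python's built-in sqlite3 module. SQLite stands in for MySQL here (an assumption; this query happens to work in both), and the second company (999999) and the helper name latest_codes are made up for illustration.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (id INTEGER PRIMARY KEY, company_id INTEGER, "
    "report_year INTEGER, code TEXT)"
)
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?)",
    [
        (1, 123456, 2013, "ASD"),
        (2, 123456, 2013, "SDF"),
        (3, 123456, 2012, "ASD"),
        (4, 123456, 2012, "SDF"),
        (5, 999999, 2009, "XYZ"),  # hypothetical second company
    ],
)


def latest_codes(conn, company_id):
    """Return all rows for the company's most recent report_year."""
    query = """
        SELECT id, company_id, report_year, code
        FROM my_table t1
        WHERE company_id = ?
          AND report_year = (SELECT MAX(report_year)
                             FROM my_table t2
                             WHERE t2.company_id = t1.company_id)
        ORDER BY id
    """
    return conn.execute(query, (company_id,)).fetchall()


rows_correlated = latest_codes(conn, 123456)
print(rows_correlated)
```

The company id is passed as a bound parameter rather than pasted into the SQL string, which is usually the safer habit when the value comes from user input.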
You could do this by joining the table to a derived table that returns the max year per company, like so:
select my_table.id, my_table.company_id, my_table.report_year, my_table.code
from my_table
inner join (
select max(report_year) as maxYear, company_id
from my_table
group by company_id
) maxYear ON my_table.report_year = maxYear.maxYear
and my_table.company_id = maxYear.company_id
To limit this to a specific company, just add your WHERE clause back:
select my_table.id, my_table.company_id, my_table.report_year, my_table.code
from my_table
inner join (
select max(report_year) as maxYear, company_id
from my_table
where my_table.company_id= 123456
group by company_id
) maxYear ON my_table.report_year = maxYear.maxYear
and my_table.company_id = maxYear.company_id
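The join approach can be checked the same way. This is a rough sketch with Python's sqlite3 module, again using SQLite as a stand-in for MySQL (an assumption). The aliases m, latest and max_year, the extra company 999999, and the function name latest_per_company are illustrative, not from the original answer.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (id INTEGER PRIMARY KEY, company_id INTEGER, "
    "report_year INTEGER, code TEXT)"
)
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?)",
    [
        (1, 123456, 2013, "ASD"),
        (2, 123456, 2013, "SDF"),
        (3, 123456, 2012, "ASD"),
        (4, 123456, 2012, "SDF"),
        (5, 999999, 2009, "XYZ"),  # hypothetical second company
    ],
)


def latest_per_company(conn):
    """Return, for every company, the rows from its latest report_year."""
    query = """
        SELECT m.id, m.company_id, m.report_year, m.code
        FROM my_table m
        INNER JOIN (
            -- one row per company holding its latest year
            SELECT company_id, MAX(report_year) AS max_year
            FROM my_table
            GROUP BY company_id
        ) latest
          ON m.company_id = latest.company_id
         AND m.report_year = latest.max_year
        ORDER BY m.id
    """
    return conn.execute(query).fetchall()


rows_joined = latest_per_company(conn)
print(rows_joined)
```

Without a WHERE clause the query returns the latest-year rows for every company at once, which is handy if you need more than one company in a single pass.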
Often, an anti-join yields better performance than using subqueries:
SELECT t1.id, t1.company_id, t1.report_year, t1.code
FROM `my_table` t1
LEFT JOIN `my_table` t2
ON t2.company_id = t1.company_id AND t2.report_year > t1.report_year
WHERE t1.company_id = 123456 AND t2.report_year IS NULL
For best performance, ensure you have a multi-column index on (company_id, report_year).
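The anti-join works because the LEFT JOIN looks for a row t2 of the same company with a later year; where none exists, the t2 columns come back NULL, so the IS NULL filter keeps only latest-year rows. Below is a small sketch of that, plus the suggested index, using Python's sqlite3 module. SQLite is a stand-in for MySQL (an assumption), it says nothing about MySQL's actual performance, and the index name idx_company_year and the function name latest_codes_antijoin are hypothetical.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (id INTEGER PRIMARY KEY, company_id INTEGER, "
    "report_year INTEGER, code TEXT)"
)
# Multi-column index as suggested above; the name is made up.
conn.execute(
    "CREATE INDEX idx_company_year ON my_table (company_id, report_year)"
)
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?)",
    [
        (1, 123456, 2013, "ASD"),
        (2, 123456, 2013, "SDF"),
        (3, 123456, 2012, "ASD"),
        (4, 123456, 2012, "SDF"),
    ],
)


def latest_codes_antijoin(conn, company_id):
    """Keep rows for which no later-year row of the same company exists."""
    query = """
        SELECT t1.id, t1.company_id, t1.report_year, t1.code
        FROM my_table t1
        LEFT JOIN my_table t2
          ON t2.company_id = t1.company_id
         AND t2.report_year > t1.report_year
        WHERE t1.company_id = ?
          AND t2.report_year IS NULL
        ORDER BY t1.id
    """
    return conn.execute(query, (company_id,)).fetchall()


rows_antijoin = latest_codes_antijoin(conn, 123456)
print(rows_antijoin)
```

Whether this beats the subquery versions depends on the data and the optimizer, so it is worth comparing the plans with EXPLAIN on your own table.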
You can read more about this technique in the book SQL Antipatterns, which is where I learned it.