简体   繁体   中英

Getting the most recent data for each value in another column

I am trying to get the data associated with the most recent curve_date corresponding to each tenor_years value and am using the query below to do this. However, I am not getting the data as I would like it to be.

SELECT
  tenor_years,
  yield_pct,
  MAX(curve_date) AS "MostRecentDate"
FROM yc_node_hist
where fk_yc_update = 12
GROUP BY tenor_years, yield_pct
order by tenor_years

SELECT * FROM yc_node_hist where fk_yc_update = 12 gives the data below:

id      fk_yc_update curve_date  tenor_years    yield_pct     
353443  12           2013-07-26       1           0.1436
353444  12           2013-07-29       1           0.1389
353445  12           2013-07-30       1           0.133

The data comes out as follows:

tenor_years   yield_pct   curve_date
1              0.0828     2014-05-14
1              0.0832     2014-05-19

I want to get something like:

tenor_years   yield_pct   curve_date
1              0.0828     2014-05-14
2              0.3232     2015-06-17
..
30

Thank You

You have to remove the yield_pct from the group by:

SELECT
  tenor_years,
  MAX(curve_date) AS "MostRecentDate"
FROM yc_node_hist
where fk_yc_update = 12
GROUP BY tenor_years;

And then join back on itself:

SELECT a.tenor_years, a.curve_date,a.yield_pct
FROM yc_node_hist a
INNER JOIN (
  SELECT
    tenor_years,
    MAX(curve_date) AS "MostRecentDate"
  FROM yc_node_hist
  where fk_yc_update = 12) b 
ON a.tenor_years=b.tenor_years AND a.curve_date=b.MostRecentDate
ORDER BY tenor_years ASC;
GROUP BY tenor_years;

SQL Server offers PARTITION / OVER functionality for situations like that.

SELECT tenor_years,
        yield_pct,
        MostRecentDate
 FROM (
    SELECT
        tenor_years,
        yield_pct,
        curve_date AS "MostRecentDate",
        RANK() OVER (PARTITION BY tenor_years ORDER BY curve_date DESC) N
    FROM yc_node_hist
    where fk_yc_update = 12
)M
WHERE N = 1
ORDER BY tenor_years

This produces a fast query with a projection, avoiding the need to join back to the original.

Demo.

with x as
(SELECT
tenor_years,
--yield_pct,
MAX(curve_date) AS "MostRecentDate"
FROM yc_node_hist
where fk_yc_update = 12
GROUP BY tenor_years --, yield_pct
 )
select x.tenor_years, y.yield_pct, y.mostrecentdate
from x join yc_node_hist y on x.mostrecentdate = y.curvedate

CTE is one way to achieve it and the grouping on yield_pct has to be removed.

SELECT tenor_years,
       yield_pct,
       curve_date
FROM (
        SELECT tenor_years,
               yield_pct,
               curve_date,
               MAX(curve_date) OVER (PARTITION BY tenor_years) max_date
        FROM yc_node_hist
        WHERE fk_yc_update = 12
      ) [anon]
WHERE curve_date = max_date
ORDER BY tenor_years;

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM