
Looking for DB2 SQL query

I am not very familiar with SQL, and I hope an expert here can show me a suitable and efficient query for what I want to achieve. I am using DB2, by the way.

Below is a screenshot of some sample data. What I need is for a given year, select the record with distinct ID1+ID2+Name columns and maximum (most recent) effective date (in YYYYMMDD format, stored as integer), with the above year being in between YearFrom and YearTo range.


For anyone who can't see the screenshot:

NAME     YearFrom     YearTo    ID1    ID2    EffDate
item1    2002         2005      AB     10     20091201
item1    2009         2013      AB     10     20100301
item2    2001         2004      XX     20     20050103 
item2    2002         2009      XX     20     20060710 
item2    2007         2013      XX     20     20090912 
item3    2005         2010      YY     30     20110304 

I hope I explained it well. For example if user is looking for available items in year 2011, item1 (with eff. date 20100301) and item 2 (with eff. date 20090912) will be returned.

If someone is looking for items available in year 2008: item2 (with eff. date 20090912) and item 3 will be returned. Item 1 will not be returned in this case because the most recent record for item 1 has range of 2009-2013.

I think I have the first part of the query right, but I don't know how to select the valid records from those results based on the year, all in one query.

select name,id1,id2,max(effdate) 
from [table] 
group by name,id1,id2

Any help would be much appreciated.

You can use the query below for this type of output.

You want the row whose effective date is the maximum for each item, so first take only those rows and then apply the year condition to them.

SELECT NAME, Id1, Id2, Effdate
  FROM Table_Name t_1
 WHERE Effdate =
       -- latest effective date for this NAME/Id1/Id2 combination
       (SELECT MAX(t_2.Effdate)
        FROM Table_Name t_2
        WHERE t_2.NAME = t_1.NAME
        AND t_2.id1 = t_1.id1
        AND t_2.id2 = t_1.id2)
   -- keep the latest row only if its range covers the requested year
   AND Your_Year_Variable_Value BETWEEN t_1.Yearfrom AND t_1.Yearto
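
To check the idea above (find each item's latest row, then test the year) against the sample data from the question, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in for DB2, the table name `items` and the helper names are assumptions made for illustration, and the subquery takes MAX(effdate) as the description above calls for.

```python
import sqlite3

# Sample rows from the question: name, yearfrom, yearto, id1, id2, effdate
ROWS = [
    ("item1", 2002, 2005, "AB", 10, 20091201),
    ("item1", 2009, 2013, "AB", 10, 20100301),
    ("item2", 2001, 2004, "XX", 20, 20050103),
    ("item2", 2002, 2009, "XX", 20, 20060710),
    ("item2", 2007, 2013, "XX", 20, 20090912),
    ("item3", 2005, 2010, "YY", 30, 20110304),
]


def make_db():
    """Create an in-memory SQLite table (hypothetical name: items)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE items (name TEXT, yearfrom INTEGER, yearto INTEGER,"
        " id1 TEXT, id2 INTEGER, effdate INTEGER)"
    )
    conn.executemany("INSERT INTO items VALUES (?, ?, ?, ?, ?, ?)", ROWS)
    return conn


# Latest row per (name, id1, id2) first, then the year test on that row.
LATEST_THEN_FILTER = """
SELECT name, id1, id2, effdate
  FROM items t_1
 WHERE effdate = (SELECT MAX(t_2.effdate)
                    FROM items t_2
                   WHERE t_2.name = t_1.name
                     AND t_2.id1 = t_1.id1
                     AND t_2.id2 = t_1.id2)
   AND ? BETWEEN t_1.yearfrom AND t_1.yearto
 ORDER BY name
"""


def items_for_year(conn, year):
    """Return the latest row of each item if its range covers `year`."""
    return conn.execute(LATEST_THEN_FILTER, (year,)).fetchall()


conn = make_db()
for year in (2011, 2008):
    print(year, items_for_year(conn, year))
```

On the sample data this returns item1 and item2 for 2011, and item2 and item3 for 2008, which matches the results the question asks for.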

It's not clear whether these two statements are in conflict. I think they are in conflict, and I'm going with statement 1 in the code below.

[1.] What I need is for a given year, select the record with distinct ID1+ID2+Name columns and maximum (most recent) effective date (in YYYYMMDD format, stored as integer), with the above year being in between YearFrom and YearTo range.

[2.] Item 1 will not be returned in this case because the most recent record for item 1 has range of 2009-2013.

I would say that item 1 would not be returned, because it has no information for year 2008. If it did have information for 2008, it should be returned per statement 1 above, regardless of whether there happened to be more recent data.
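
One way to see where the two statements diverge is to run both readings against the sample data. The sketch below uses Python's built-in sqlite3 module as a stand-in for DB2; the table name `items`, the query names, and the helper `run` are assumptions for illustration only. For 2008 the two readings agree, but for 2003 they do not.

```python
import sqlite3

# Sample rows from the question: name, yearfrom, yearto, id1, id2, effdate
ROWS = [
    ("item1", 2002, 2005, "AB", 10, 20091201),
    ("item1", 2009, 2013, "AB", 10, 20100301),
    ("item2", 2001, 2004, "XX", 20, 20050103),
    ("item2", 2002, 2009, "XX", 20, 20060710),
    ("item2", 2007, 2013, "XX", 20, 20090912),
    ("item3", 2005, 2010, "YY", 30, 20110304),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (name TEXT, yearfrom INTEGER, yearto INTEGER,"
    " id1 TEXT, id2 INTEGER, effdate INTEGER)"
)
conn.executemany("INSERT INTO items VALUES (?, ?, ?, ?, ?, ?)", ROWS)

# Statement 1: keep rows whose range covers the year, then take the
# latest effective date per item among those rows.
FILTER_THEN_LATEST = """
SELECT name, id1, id2, MAX(effdate)
  FROM items
 WHERE ? BETWEEN yearfrom AND yearto
 GROUP BY name, id1, id2
 ORDER BY name
"""

# Statement 2: take each item's latest row overall, then keep it only
# if its range covers the year.
LATEST_THEN_FILTER = """
SELECT t.name, t.id1, t.id2, t.effdate
  FROM items t
  JOIN (SELECT name, id1, id2, MAX(effdate) AS effdate
          FROM items
         GROUP BY name, id1, id2) m
    ON m.name = t.name AND m.id1 = t.id1 AND m.id2 = t.id2
   AND m.effdate = t.effdate
 WHERE ? BETWEEN t.yearfrom AND t.yearto
 ORDER BY t.name
"""


def run(sql, year):
    """Execute one of the two queries for the given year."""
    return conn.execute(sql, (year,)).fetchall()


for year in (2008, 2003):
    print(year, "statement 1:", run(FILTER_THEN_LATEST, year))
    print(year, "statement 2:", run(LATEST_THEN_FILTER, year))
```

For 2003, statement 1 returns item1 and item2 (both have a row covering that year), while statement 2 returns nothing, because the most recent row of every item has a range that starts after 2003.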


If you expand your table so each year appears in a row by itself, rather than being implied by a range like 2002-2005, it's pretty simple. The query below is in PostgreSQL; you should only have to replace the first common table expression with a DB2 equivalent to generate a table of numbers (or use an actual table of numbers), and fix up the CTE syntax. (DB2's CTE syntax is unique.)

with years as (
  select generate_series(2000, 2020) as year
),
expanded_table1 as (
  select id1, id2, name, year, yearfrom, yearto, effdate
  from Table1
  inner join years on years.year between YearFrom and YearTo
)
select id1, id2, name, year, max(effdate)
from expanded_table1
where year = 2008
group by id1, id2, name, year

Explanation

The first CTE generates a series of integers that represent all the years we might be interested in. A more robust solution might select the minimum and maximum years for the number generator from your table instead of using integer literals.

select generate_series(2000, 2020) as year;

YEAR
--
2000
2001
2002
...
2020
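
Where generate_series is not available, a recursive common table expression is one common way to build the same table of years. The sketch below also takes the bounds from the data, as suggested above, rather than from literals. It runs on SQLite through Python's sqlite3 module purely as a stand-in; the table name `items` and the CTE name `bounds` are assumptions, and the exact recursive CTE syntax on DB2 (for example, whether the RECURSIVE keyword is accepted) may differ and is not verified here.

```python
import sqlite3

# Sample rows from the question: name, yearfrom, yearto, id1, id2, effdate
ROWS = [
    ("item1", 2002, 2005, "AB", 10, 20091201),
    ("item1", 2009, 2013, "AB", 10, 20100301),
    ("item2", 2001, 2004, "XX", 20, 20050103),
    ("item2", 2002, 2009, "XX", 20, 20060710),
    ("item2", 2007, 2013, "XX", 20, 20090912),
    ("item3", 2005, 2010, "YY", 30, 20110304),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (name TEXT, yearfrom INTEGER, yearto INTEGER,"
    " id1 TEXT, id2 INTEGER, effdate INTEGER)"
)
conn.executemany("INSERT INTO items VALUES (?, ?, ?, ?, ?, ?)", ROWS)

# bounds: smallest YearFrom and largest YearTo found in the table.
# years:  starts at the lower bound and adds 1 until the upper bound.
YEARS_SQL = """
WITH RECURSIVE
bounds(lo, hi) AS (
    SELECT MIN(yearfrom), MAX(yearto) FROM items
),
years(year) AS (
    SELECT lo FROM bounds
    UNION ALL
    SELECT year + 1 FROM years, bounds WHERE year < hi
)
SELECT year FROM years ORDER BY year
"""


def year_range(conn):
    """Return every year between the table's min YearFrom and max YearTo."""
    return [row[0] for row in conn.execute(YEARS_SQL)]


print(year_range(conn))
```

With the sample data the generated years run from 2001 through 2013, so no literal range has to be maintained as the table grows.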

By joining that table with your table, we can expand the ranges into rows.

with years as (
  select generate_series(2000, 2020) as year
)
select id1, id2, name, year, yearfrom, yearto, effdate
from Table1
inner join years on years.year between YearFrom and YearTo
order by id1, id2, name, year;

ID1    ID2    NAME      YEAR     YEARFROM  YEARTO   EFFDATE
--
AB     10     item1     2002     2002      2005     20091201
AB     10     item1     2003     2002      2005     20091201
AB     10     item1     2004     2002      2005     20091201
AB     10     item1     2005     2002      2005     20091201
...

Having prepared the foundation this way, the query to find the maximum effective date for each distinct combination of id1, id2, name, for a given year is just a simple GROUP BY with a WHERE clause.

with years as (
  select generate_series(2000, 2020) as year
),
expanded_table1 as (
  select id1, id2, name, year, yearfrom, yearto, effdate
  from Table1
  inner join years on years.year between YearFrom and YearTo
)
select id1, id2, name, year, max(effdate)
from expanded_table1
where year = 2011
group by id1, id2, name, year

ID1    ID2    NAME      YEAR     MAX
--
AB     10     item1     2011     20100301
XX     20     item2     2011     20090912
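
The whole approach can be reproduced end to end on the sample data. This is a minimal sketch using Python's built-in sqlite3 module, with a recursive CTE standing in for generate_series; the table name `items`, the CTE name `expanded`, and the helper `max_effdate_for_year` are assumptions for illustration, and SQLite is not DB2, so the syntax may need adjusting.

```python
import sqlite3

# Sample rows from the question: name, yearfrom, yearto, id1, id2, effdate
ROWS = [
    ("item1", 2002, 2005, "AB", 10, 20091201),
    ("item1", 2009, 2013, "AB", 10, 20100301),
    ("item2", 2001, 2004, "XX", 20, 20050103),
    ("item2", 2002, 2009, "XX", 20, 20060710),
    ("item2", 2007, 2013, "XX", 20, 20090912),
    ("item3", 2005, 2010, "YY", 30, 20110304),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (name TEXT, yearfrom INTEGER, yearto INTEGER,"
    " id1 TEXT, id2 INTEGER, effdate INTEGER)"
)
conn.executemany("INSERT INTO items VALUES (?, ?, ?, ?, ?, ?)", ROWS)

# years:    one row per year from 2000 to 2020 (same literals as above).
# expanded: one row per (item row, year covered by that row's range).
EXPAND_SQL = """
WITH RECURSIVE years(year) AS (
    SELECT 2000
    UNION ALL
    SELECT year + 1 FROM years WHERE year < 2020
),
expanded AS (
    SELECT id1, id2, name, year, effdate
      FROM items
      JOIN years ON years.year BETWEEN yearfrom AND yearto
)
SELECT id1, id2, name, year, MAX(effdate)
  FROM expanded
 WHERE year = ?
 GROUP BY id1, id2, name, year
 ORDER BY id1, id2, name
"""


def max_effdate_for_year(conn, year):
    """Latest effective date per (id1, id2, name) among rows covering year."""
    return conn.execute(EXPAND_SQL, (year,)).fetchall()


for year in (2011, 2008):
    print(year, max_effdate_for_year(conn, year))
```

For 2011 this yields the same two rows as the output above; for 2008 it yields item2 and item3, and item1 is absent because none of its rows covers that year.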
