
Oracle SQL - Counting entities per day from entity start date and end date

I'm starting with a table like so:

ENTITY_ID  META_ATTRIB_1  META_ATTRIB_2  START_DATE  END_DATE
1          FOO            BAR            2020-01-01  2020-12-01

I would like to end up with a count of entities per day that fall within given sets of meta-attributes:

DAY         META_ATTRIB_1  META_ATTRIB_2  COUNT
2020-01-01  FOO            BAR            1
2020-01-02  FOO            BAR            1
2020-01-03  FOO            BAR            1

Right now I'm doing this by generating a sequence of dates from DUAL, joining the target table on DAY BETWEEN START_DATE AND END_DATE, and grouping by DAY, META_ATTRIB_1, META_ATTRIB_2.

This method is running into performance problems. Is there a better way to split each entity row across the desired sequence of days and then aggregate the result back into a per-day count?

A typical approach uses a recursive query to generate one row per day in each range, then aggregates:

with cte (meta_attrib_1, meta_attrib_2, dt, end_date) as (
    select meta_attrib_1, meta_attrib_2, start_date, end_date from mytable
    union all
    select meta_attrib_1, meta_attrib_2, dt + 1, end_date from cte where dt < end_date
)
select dt, meta_attrib_1, meta_attrib_2, count(*) as cnt
from cte
group by dt, meta_attrib_1, meta_attrib_2

This is pretty close to the logic that you described. You did not show your actual query so it is hard to tell whether this is a better solution than what you are doing currently.
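To see what the recursive query above produces, here is a minimal sketch that runs the same logic against an in-memory SQLite database from Python. This is not Oracle: SQLite needs `date(dt, '+1 day')` in place of `dt + 1`, and dates are stored as ISO text. The second entity and the shortened date ranges are assumptions added only to make the counts visible.

```python
import sqlite3

# In-memory stand-in for the question's table. The rows are illustrative
# assumptions (shortened ranges, a second entity), not data from the post.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table mytable (entity_id integer, meta_attrib_1 text, "
    "meta_attrib_2 text, start_date text, end_date text)"
)
conn.executemany(
    "insert into mytable values (?, ?, ?, ?, ?)",
    [
        (1, "FOO", "BAR", "2020-01-01", "2020-01-03"),
        (2, "FOO", "BAR", "2020-01-02", "2020-01-04"),
    ],
)

# Same shape as the recursive CTE above, in SQLite syntax: the anchor member
# emits each start date, the recursive member adds one day until end_date.
query = """
with recursive cte (meta_attrib_1, meta_attrib_2, dt, end_date) as (
    select meta_attrib_1, meta_attrib_2, start_date, end_date from mytable
    union all
    select meta_attrib_1, meta_attrib_2, date(dt, '+1 day'), end_date
    from cte
    where dt < end_date
)
select dt, meta_attrib_1, meta_attrib_2, count(*) as cnt
from cte
group by dt, meta_attrib_1, meta_attrib_2
order by dt
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With these two overlapping ranges, the days 2020-01-02 and 2020-01-03 get a count of 2 and the outer days a count of 1.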

You might find that a recursive CTE is faster:

with cte (day, meta_attrib_1, meta_attrib_2, end_date) as (
      -- anchor member: one row per entity, starting at its start_date
      select start_date, meta_attrib_1, meta_attrib_2, end_date
      from t
      union all
      -- recursive member: step forward one day until end_date is reached
      select day + interval '1' day, meta_attrib_1, meta_attrib_2, end_date
      from cte
      where day < end_date
     )
select day, meta_attrib_1, meta_attrib_2, count(*)
from cte
group by day, meta_attrib_1, meta_attrib_2;

The advantage of a recursive CTE is that it "localizes" the expansion of the dates. Instead of relying on a non-equijoin against a generated list of days, it simply churns out the additional dates from each row.

This is still producing a separate row for each day for each of the original rows. That means that the aggregation could be the bottleneck.
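If the aggregation does turn out to be the bottleneck, one commonly used alternative is to avoid the per-entity expansion altogether. It is not what either answer above proposes. The idea is to record a +1 event on each start date and a -1 event on the day after each end date, then keep a running total per attribute pair. The sketch below shows the idea in plain Python; the sample rows and the `daily_counts` helper are hypothetical. In SQL the running total would typically be an analytic `SUM(...) OVER (...)`, and whether that is faster on real data is something to measure.

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical sample rows: (meta_attrib_1, meta_attrib_2, start_date, end_date).
rows = [
    ("FOO", "BAR", date(2020, 1, 1), date(2020, 1, 3)),
    ("FOO", "BAR", date(2020, 1, 2), date(2020, 1, 4)),
]

def daily_counts(rows):
    """Count active entities per day and attribute pair via +1/-1 events."""
    one_day = timedelta(days=1)
    deltas = defaultdict(lambda: defaultdict(int))
    for a1, a2, start, end in rows:
        deltas[(a1, a2)][start] += 1          # entity becomes active
        deltas[(a1, a2)][end + one_day] -= 1  # inactive the day after end_date

    result = []
    for (a1, a2), events in sorted(deltas.items()):
        day, last = min(events), max(events)
        active = 0
        while day < last:  # walk the calendar once per group, not per entity
            active += events.get(day, 0)
            if active:
                result.append((day, a1, a2, active))
            day += one_day
    return result

counts = daily_counts(rows)
for row in counts:
    print(row)
```

Each entity contributes two events regardless of how long its range is, so the work grows with the number of entities plus the number of days per group, not with their product.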
