简体   繁体   中英

Lossless / Fair Share Division in SQL

I have a couple of situations where I want use PostgreSQL to be able to divide an integer by another integer and accomplish three things:

  1. Have all resultings numbers be integers
  2. Have the sum of all resulting numbers equal the original number (a lossless method)
  3. Employ a "fair share" spread, to avoid the "hockey stick" effect of dumping the remainder into the last bucket

The specific issue I am dealing with now is I have monthly quantities such as this:

Month      Part       Qty
---------  ---------  -----
1/1/2016   ABC        10
2/1/2016   ABC         9

This is about as far as I have gotten successfully, which has taken care of my first two goals, but not my third:

with weekly_buckets as (
  select generate_series (0, 3) as week_number
)
select
  p.month_date + 7 * w.week_number as week_date,
  p.part_number,
  case w.week_number
    when 0 then p.qty / 4
    when 1 then p.qty / 4
    when 2 then p.qty / 4
    when 3 then p.qty - 3 * (p.qty / 4)
  end as qty
from
  part_demand p
  cross join weekly_buckets w

Resulting in:

Week       Part       Qty
---------  ---------  -----
1/1/2016   ABC        2
1/8/2016   ABC        2
1/15/2016  ABC        2
1/22/2016  ABC        4

Hence the hockey-stick effect of 4 in the final week. I could use a ceiling instead of a floor, but that would be even worse at 3, 3, 3, 1.

Ideally, the spread would look like 2, 3, 2, 3 or 3, 2, 3, 2. Either is acceptable. This is what I mean by fair-share spread. If there is a more mathematically correct term for this, please enlighten me.

For reference, these quantities represent monthly forecasts, and we are trying to determine if actual orders are on target, ahead of or behind forecasts.

Also, If I can solve this, I can leverage the logic to do the same thing to create monthly buckets for annual quantities.

As a final example, if I saw something like this:

Month      Part       Qty
---------  ---------  -----
1/1/2016   ABC        1

Ideally, the only week with a non-zero value would be week 2 or week 3. It really doesn't matter, but in case you were wondering how low values would be handled, this is what I had in mind.

You can use window functions to allocate the values. Here is your first example:

with b as (
      select generate_series(1, 4) as i, 10 as amt
     ),
     bb as (
      select b.*,
             count(*) over () as numbuckets,
             row_number() over (order by i) as rn,
         amt % (count(*) over () ) as remainder
      from b
     )
select bb.*,
       (amt / numbuckets +
        (case when rn <= remainder then 1 else 0 end)
       ) as partitioned
from bb;

The idea is to allocate amt / numbuckets to each bucket. Then to spread the remainder through the buckets using row_number() . If you want the values spread randomly, then use order by random() for the definition of rn .

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM