Calculating a field's value according to the values of the previous and next fields

For clarity, assume that I have a table with a carID, a mileage and a date. The dates always fall on the first of a month (e.g. 01/02/2015, 01/03/2015, ...). Each carID has a row for each month, but not every row has a value for the mileage field; some are NULL.

Example table:

carID           mileage           date
-----------------------------------------
1               400            01/01/2015
2               NULL           01/02/2015
3               NULL           01/03/2015
4               1050           01/04/2015

If such a field is NULL, I need to calculate what value it should have by looking at the previous and next non-NULL values (these aren't necessarily in the adjacent months; they can be months apart).

I want to do this by taking the difference between the previous and next values, calculating the time between them, and setting the missing value in proportion to the elapsed time. However, I have no idea how to do this.

I have already used a bit of code to look up the previous value; it looks like this:

, carKMcombiDiffList as (
select ml.*,
       (ml.KM - mlprev.KM) as diff
from carKMcombilist ml outer apply
     (select top 1 ml2.*
      from carKMcombilist ml2
      where ml2.FK_CarID = ml.FK_CarID and
            ml2.beginmonth < ml.beginmonth
      order by ml2.beginmonth desc
     ) mlprev
)

What this does is check whether the current value is larger than the previous value. I assume I can use the same approach to find the previous value in my current problem; I just don't know how to add the next value to it, along with all the logic I need to make the calculations.
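One way the existing pattern might be extended is to add a second OUTER APPLY that mirrors the first but looks forward instead of backward. The following is only a sketch: carKMcombilist, FK_CarID, beginmonth and KM are taken from the snippet above, while the sample rows, the output aliases and the assumption that missing readings are stored as NULL in KM are made up for illustration.

```sql
-- Sketch only: the CTE stands in for the real carKMcombilist with hypothetical sample rows.
with carKMcombilist as (
    select * from (values
        (1, cast('20150101' as date), 400),
        (1, '20150201', null),
        (1, '20150301', null),
        (1, '20150401', 1050)
    ) as v(FK_CarID, beginmonth, KM)
)
select ml.FK_CarID,
       ml.beginmonth,
       mlprev.KM         as prevKM,
       mlprev.beginmonth as prevMonth,
       mlnext.KM         as nextKM,
       mlnext.beginmonth as nextMonth
from carKMcombilist ml
outer apply
     (select top 1 ml2.KM, ml2.beginmonth
      from carKMcombilist ml2
      where ml2.FK_CarID = ml.FK_CarID
        and ml2.beginmonth < ml.beginmonth
        and ml2.KM is not null          -- skip other missing readings
      order by ml2.beginmonth desc      -- closest earlier month first
     ) mlprev
outer apply
     (select top 1 ml3.KM, ml3.beginmonth
      from carKMcombilist ml3
      where ml3.FK_CarID = ml.FK_CarID
        and ml3.beginmonth > ml.beginmonth
        and ml3.KM is not null
      order by ml3.beginmonth asc       -- closest later month first
     ) mlnext
where ml.KM is null                     -- only rows that need a value
order by ml.FK_CarID, ml.beginmonth;
```

With these sample rows, both missing months should come back with 400 (January) as the previous reading and 1050 (April) as the next one. Because OUTER APPLY is used, rows with no earlier or later reading are kept and simply get NULL in the corresponding columns.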

The following query obtains the previous and next available mileages for a record.

with data as --test data
(
    select * from (VALUES
        (0, null, getdate()),   -- dummy row: gives the [date] column a datetime type; filtered out below
        (1, 400, '20150101'),
        (1, null, '20150201'),
        (1, null, '20150301'),
        (1, 1050, '20150401'),
        (2, 300, '20150101'),
        (2, null, '20150201'),
        (2, null, '20150301'),
        (2, 1235, '20150401'),
        (2, null, '20150501'),
        (2, 1450, '20150601'),
        (3, 200, '20150101'),
        (3, null, '20150201')
    ) as v(carId, mileage, [date])
    where v.carId != 0
)
-- replace 'data' with your table name
select  d.*, 
        (select top 1 mileage from data dprev where dprev.mileage is not null and dprev.carId = d.carId and dprev.[date] <= d.date order by dprev.[date] desc) as 'Prev available mileage',
        (select top 1 mileage from data dnext where dnext.mileage is not null and dnext.carId = d.carId and dnext.[date] >= d.date order by dnext.[date] asc) as 'Next available mileage'
from    data d

Note that these columns can still be NULL if there is no data available before/after a specific date.

From here, it's up to you how you use these values. You probably want to interpolate values for the records where mileage is missing.

Edit

In order to interpolate the values for the missing mileages, I had to compute three auxiliary columns:

ri - index of the record within a continuous group where mileage is missing
gi - index of a continuous group where mileage is missing, per car
gc - count of records per continuous group where mileage is missing

The limit columns from the query above were renamed to
pa (Previous Available) and
na (Next Available).

The query is not compact and I am sure it can be improved, but the good part of the cascading CTEs is that you can easily check intermediate results and understand each step.

SQL Fiddle: SO 29363187

with data as --test data
(
    select * from (VALUES
        (0, null, getdate()),   -- dummy row: gives the [date] column a datetime type; filtered out below
        (1, 400, '20150101'),
        (1, null, '20150201'),
        (1, null, '20150301'),
        (1, 1050, '20150401'),
        (2, 300, '20150101'),
        (2, null, '20150201'),
        (2, null, '20150301'),
        (2, 1235, '20150401'),
        (2, null, '20150501'),
        (2, 1450, '20150601'),
        (3, 200, '20150101'),
        (3, null, '20150201')
    ) as v(carId, mileage, [date])
    where v.carId != 0
),
-- replace 'data' with your table name
limits AS
(
    select  d.*, 
            (select top 1 mileage from data dprev where dprev.mileage is not null and dprev.carId = d.carId and dprev.[date] <= d.date order by dprev.[date] desc) as pa,
            (select top 1 mileage from data dnext where dnext.mileage is not null and dnext.carId = d.carId and dnext.[date] >= d.date order by dnext.[date] asc) as na
    from    data d
),
t1 as
(
    SELECT l.*,
           case when mileage is not null 
                then null 
                else row_number() over (partition by l.carId, l.pa, l.na  order by  l.carId, l.[date])
           end as ri,   -- index of record in a continuous group where mileage is missing
           case when mileage is not null 
                then null 
                else dense_rank() over (partition by carId order by  l.carId, l.pa, l.na)
           end as gi    -- index of  a continuous group where mileage is missing per car
    from limits l
),
t2 as
(
    select  *,
            (select count(*) from t1 tm where tm.carId = t.carId and tm.gi = t.gi)  gc  --count of records per continuous group where mileage is missing
    FROM    t1 t
)
select  *,
        case when mileage is NULL
            then pa + (na - pa) / (gc + 1.0) * ri   -- also converts from integer to decimal
            else NULL
        end as 'Interpolated value' 
from    t2
order by carId, [date]
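
The final expression can be checked by hand against the test data. The sketch below plugs in the values that pa, na, gc and ri should take for the three missing readings of carId 2; the literal rows are typed in manually for illustration rather than produced by the query itself.

```sql
-- pa, na, gc and ri are the columns defined in the query above;
-- the literal rows are the three missing readings of carId 2 in the test data.
select v.pa, v.na, v.gc, v.ri,
       v.pa + (v.na - v.pa) / (v.gc + 1.0) * v.ri as interpolated
from (values
        (300,  1235, 2, 1),   -- 2015-02: first of two missing months
        (300,  1235, 2, 2),   -- 2015-03: second of two missing months
        (1235, 1450, 1, 1)    -- 2015-05: single missing month
     ) as v(pa, na, gc, ri);
```

This should give roughly 611.67, 923.33 and 1342.5. Dividing by gc + 1 splits each gap into equal steps, which relies on the rows being evenly spaced (one row per month, as the question states).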

Assumption: CarID and date are always a unique combination.

This is what I came up with:

select with_dates.*,
       prev_mileage.mileage as prev_mileage,
       next_mileage.mileage as next_mileage,
       next_mileage.mileage - prev_mileage.mileage as mileage_delta,
       datediff(month,prev_d,next_d) as month_delta,
       prev_mileage.mileage + (next_mileage.mileage - prev_mileage.mileage) * 1.0 / datediff(month,prev_d,next_d) * datediff(month,prev_d,with_dates.d) as estimated_mileage -- * 1.0 avoids integer division if mileage is an integer column
  from (select *,
          (select top 1 d
             from mileage as prev
            where carid = c.carid
              and prev.d < c.d
              and prev.mileage is not null
         order by d desc ) as prev_d,
           (select top 1 d
             from mileage as next_rec
            where carid = c.carid
              and next_rec.d > c.d
              and next_rec.mileage is not null
         order by d asc) as next_d
          from mileage as c
         where mileage is null) as with_dates
  join mileage as prev_mileage
    on     prev_mileage.carid = with_dates.carid
       and prev_mileage.d = with_dates.prev_d
  join mileage as next_mileage
    on     next_mileage.carid = with_dates.carid
       and next_mileage.d = with_dates.next_d

Logic: First, for every record where mileage is null, I select the previous and the next date where mileage is not null. After this, I just join the rows based on carid and date and do some simple math to approximate the value.
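
The "simple math" is a linear interpolation over the number of months. As a rough illustration, the sketch below applies it to one gap, with readings of 400 on 2015-01-01 and 1050 on 2015-04-01; the variable names and values are hypothetical and not part of the query above. The mileages are declared as decimal so that the division does not truncate.

```sql
-- Hypothetical known readings on either side of the gap.
declare @prev_d date = '20150101', @next_d date = '20150401';
declare @prev_km decimal(10, 2) = 400, @next_km decimal(10, 2) = 1050;

select m.d,
       @prev_km
       + (@next_km - @prev_km)
         / datediff(month, @prev_d, @next_d)   -- months between the known readings (3)
         * datediff(month, @prev_d, m.d)       -- months elapsed since the previous reading
       as estimated_mileage
from (values (cast('20150201' as date)), (cast('20150301' as date))) as m(d);
```

This should return roughly 616.67 for February and 833.33 for March. Unlike an approach that counts rows in the gap, weighting by datediff also works when some months have no row at all.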

Hope this helps; it was quite fun.
