简体   繁体   中英

How to replace NULL in a result set with the last NOT NULL value in the same column?

A colleague of mine has a problem with a sql query:-

Take the following as an example, two temp tables:-

select 'John' as name,10 as value into #names
UNION ALL SELECT 'Abid',20 
UNION ALL SELECT 'Alyn',30 
UNION ALL SELECT 'Dave',15;

select 'John' as name,'SQL Expert' as job into #jobs
UNION ALL SELECT 'Alyn','Driver' 
UNION ALL SELECT 'Abid','Case Statement';

We run the following query on the tables to give us a joined resultset:-

select #names.name, #names.value, #jobs.job
FROM #names left outer join #jobs
on #names.name = #jobs.name

name    value    job
John    10       SQL Expert
Abid    20       Case Statement
Alyn    30       Driver
Dave    15       NULL

As 'Dave' does not exist in the #jobs table, he is given a NULL value as expected.

My colleague wants to modify the query so each NULL value is given the same value as the previous entry.

So the above would be:-

name    value    job
John    10       SQL Expert
Abid    20       Case Statement
Alyn    30       Driver
Dave    15       Driver

Note that Dave is now a 'Driver'

There may be more than one NULL value in sequence,

name    value    job
John    10       SQL Expert
Abid    20       Case Statement
Alyn    30       Driver
Dave    15       NULL
Joe     15       NULL
Pete    15       NULL

In this case Dave, Joe and Pete should all be 'Driver', as 'Driver' is the last non null entry.

There are probably better ways to do this. Here is one of the ways I could achieve the result using Common Table Expressions (CTE) and using that output to perform a OUTER APPLY to find the previous persion's job. The query here uses id to sort the records and then determines what the previous person's job was. You need at least one criteria to sort the records because data in tables are considered to be unordered sets.

Also, the assumption is that the first person in the sequence should have a job. If the first person doesn't have a job, then there is no value to pick from.

Click here to view the demo in SQL Fiddle.

Click here to view another demo in SQL Fiddle with second data set.

Script :

 CREATE TABLE names
    (
            id      INT         NOT NULL IDENTITY
      ,     name    VARCHAR(20) NOT NULL
      ,     value   INT         NOT NULL
    );

    CREATE TABLE jobs
    (
            id  INT         NOT NULL
      ,     job VARCHAR(20) NOT NULL
    );

    INSERT INTO names (name, value) VALUES
      ('John', 10),
      ('Abid', 20),
      ('Alyn', 30),
      ('Dave', 40),
      ('Jill', 50),
      ('Jane', 60),
      ('Steve', 70);

    INSERT INTO jobs (id, job) VALUES
      (1, 'SQL Expert'),
      (2, 'Driver' ),
      (5, 'Engineer'),
      (6, 'Barrista');

    ;WITH empjobs AS
    (
        SELECT
        TOP 100 PERCENT n.id
                    ,   n.name
                    ,   n.value
                    ,   job
        FROM            names n 
        LEFT OUTER JOIN jobs j
        on              j.id = n.id
        ORDER BY        n.id
    ) 
    SELECT      e1.id
            ,   e1.name
            ,   e1.value
            ,   COALESCE(e1.job , e2.job) job FROM empjobs e1
    OUTER APPLY (
                  SELECT 
                  TOP 1     job 
                  FROM      empjobs     e2
                  WHERE     e2.id   < e1.id
                  AND       e2.job  IS NOT NULL
                  ORDER BY  e2.id   DESC
                ) e2;

Output :

ID  NAME    VALUE  JOB
--- ------  -----  -------------
1   John      10   SQL Expert
2   Abid      20   Driver
3   Alyn      30   Driver
4   Dave      40   Driver
5   Jill      50   Engineer
6   Jane      60   Barrista
7   Steve     70   Barrista

What do you mean by "last" non-null entry? You need a well-defined ordering for "last" to have a consistent meaning. Here's a query with data definitions that uses the "value" column to define last, and that might be close to what you want.

CREATE TABLE #names
    (
            id      INT         NOT NULL IDENTITY
      ,     name    VARCHAR(20) NOT NULL
      ,     value   INT         NOT NULL PRIMARY KEY
    );

    CREATE TABLE #jobs
    (
            name  VARCHAR(20)         NOT NULL
      ,     job VARCHAR(20) NOT NULL
    );

    INSERT INTO #names (name, value) VALUES
      ('John', 10),
      ('Abid', 20),
      ('Alyn', 30),
      ('Dave', 40),
      ('Jill', 50),
      ('Jane', 60),
      ('Steve', 70);

    INSERT INTO #jobs (name, job) VALUES
      ('John', 'SQL Expert'),
      ('Abid', 'Driver' ),
      ('Alyn', 'Engineer'),
      ('Dave', 'Barrista');

with Partial as (
  select
  #names.name,
  #names.value,
  #jobs.job as job
  FROM #names left outer join #jobs
  on #names.name = #jobs.name
)
  select
    name,
    value,
    (
      select top 1 job
      from Partial as P
      where job is not null
      and P.value <= Partial.value
      order by value desc
    )
  from Partial;

It might be more efficient to insert the data, then update.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM