SQL: How to fill empty cells with previous row value on basis of condition?

Please treat this as a high-priority request; help needed.

Requesting that a high-rep user embed the image ( http://i.imgur.com/P4UOiMz.jpg )

I need to produce the column "OXY_ID_NEW" in the following table using SQL. Is this possible in SQL Server 2008 R2, SQL Server 2012, or Amazon Redshift?

SQL table image: http://i.imgur.com/P4UOiMz.jpg

Basically, I want to forward-fill empty "OXY_ID" cells with the last known OXY_ID for that ID, as shown in the "OXY_ID_NEW" column.

Maybe something like:

coalesce(oxy_id, lag(oxy_id) over (partition by id order by number))

This assumes the id and number columns actually increase. In the screenshot it looks like they repeat, in which case you'll need to provide the whole table definition. Note also that LAG looks back only one row, so this fills only the first NULL after a known value.

The LAG example given by gordy is the simplest, as long as your version has it (SQL Server 2012 and later, not 2008 R2). Note, though, that you cannot use it directly to update your table: window functions can only appear in the SELECT or ORDER BY clause, so you would need a temporary table.
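
Since LAG looks back only one row, runs of several consecutive NULLs need something more. Below is a minimal sketch of one common alternative for SQL Server 2012 and later: a running COUNT of non-NULL values puts each known value and the NULLs that follow it into the same group, and MAX over that group carries the value forward. The #fill_demo table and its row_seq column are assumptions for illustration; row_seq stands in for whatever column gives the rows a unique order, which the table in the question does not appear to have.

```sql
-- Hypothetical demo table; row_seq is an assumed column that orders rows uniquely.
CREATE TABLE #fill_demo
(
    row_seq INT PRIMARY KEY,
    id      VARCHAR(10),
    number  INT,
    oxy_id  VARCHAR(2)
);

INSERT INTO #fill_demo (row_seq, id, number, oxy_id) VALUES
(1, '308_2123', 36, 'ZY'),
(2, '308_2123', 36, NULL),
(3, '308_2123', 37, NULL),
(4, '308_2123', 38, 'WY'),
(5, '308_2123', 39, NULL),
(6, '311_6789', 1,  NULL),
(7, '311_6789', 2,  'EF'),
(8, '311_6789', 3,  NULL);

WITH grouped AS
(
    SELECT row_seq, id, number, oxy_id,
           -- COUNT skips NULLs, so the running count only increases on rows that have a value.
           COUNT(oxy_id) OVER (PARTITION BY id ORDER BY row_seq
                               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS grp
    FROM #fill_demo
)
SELECT id, number, oxy_id,
       -- Each group holds at most one non-NULL value, so MAX simply picks it.
       MAX(oxy_id) OVER (PARTITION BY id, grp) AS oxy_id_new
FROM grouped
ORDER BY row_seq;
-- oxy_id_new, in row_seq order: ZY, ZY, ZY, WY, WY, NULL, EF, EF

DROP TABLE #fill_demo;
```

Row 6 stays NULL because its id has no earlier value to carry forward.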

For older versions you need a cursor. Something like:

declare @demo table
(
id varchar (10),
number int,
oxy_id varchar(2)
)

INSERT INTO @demo VALUES ('308_2123', 36, 'ZY')
INSERT INTO @demo VALUES ('308_2123', 36, NULL)
INSERT INTO @demo VALUES ('308_2123', 37, NULL)
INSERT INTO @demo VALUES ('308_2123', 37, NULL)
INSERT INTO @demo VALUES ('308_2123', 38, 'WY')
INSERT INTO @demo VALUES ('308_2123', 38, 'WY')
INSERT INTO @demo VALUES ('308_2123', 38, NULL)
INSERT INTO @demo VALUES ('308_2123', 39, NULL)
INSERT INTO @demo VALUES ('309_5647', 30, 'AB')
INSERT INTO @demo VALUES ('309_5647', 30, NULL)
INSERT INTO @demo VALUES ('309_5647', 31, NULL)
INSERT INTO @demo VALUES ('309_5647', 32, 'BC')
INSERT INTO @demo VALUES ('310_8897', 20, 'CD')
INSERT INTO @demo VALUES ('310_8897', 21, 'DC')
INSERT INTO @demo VALUES ('310_8897', 22, NULL)
INSERT INTO @demo VALUES ('310_8897', 23, NULL)
INSERT INTO @demo VALUES ('310_8897', 23, NULL)
INSERT INTO @demo VALUES ('311_6789', 1, NULL)
INSERT INTO @demo VALUES ('311_6789', 1, NULL)
INSERT INTO @demo VALUES ('311_6789', 2, 'EF')
INSERT INTO @demo VALUES ('311_6789', 3, 'GH')
INSERT INTO @demo VALUES ('311_6789', 3, NULL)
INSERT INTO @demo VALUES ('312_9874', 1, 'HK')
INSERT INTO @demo VALUES ('312_9874', 1, 'KY')
INSERT INTO @demo VALUES ('312_9874', 1, NULL)
INSERT INTO @demo VALUES ('312_9874', 1, 'YY')

DECLARE @id varchar(10)
DECLARE @oxy_id varchar(2)
DECLARE @prevOxyID varchar(2) = NULL  -- last non-NULL oxy_id seen
DECLARE @number int
DECLARE @previd varchar(10) = NULL    -- id of the row that set @prevOxyID

DECLARE cur CURSOR FOR
(SELECT d.id, d.number, d.oxy_id FROM @demo d) 

OPEN cur

FETCH NEXT FROM cur into 
    @id, @number, @oxy_id

WHILE @@FETCH_STATUS = 0
    BEGIN
        IF @oxy_id IS NULL
            BEGIN
                if @prevOxyID IS NOT NULL 
                    BEGIN
                        IF @id = @previd
                            BEGIN
                                UPDATE @demo SET oxy_id = @prevOxyID 
                                WHERE id = @id AND number = @number AND oxy_id IS NULL
                            END
                        ELSE
                            BEGIN
                                SET @prevOxyID = NULL
                            END
                    END
                SET @previd = @id
            END
        ELSE
            BEGIN
                SET @previd = @id
                SET @prevOxyID = @oxy_ID
            END
        FETCH NEXT FROM cur into 
        @id, @number, @oxy_id
    END

close cur
deallocate cur

SELECT * FROM @demo

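Please note that a cursor's SELECT can take an ORDER BY (written without the enclosing parentheses), but only if some column defines the order you want. Here (id, number) repeats, so ordering by those columns would not reproduce the order shown in your image, and without an ORDER BY the order in which rows come back is not guaranteed. If the table has no such column, you will need a temporary table with the records inserted in the right order and a column that records that order (an identity column, for example); run the cursor on the temporary table, and finally update the original table from the temporary one.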
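
As a rough sketch of what the cursor looks like once the rows can be ordered and identified, the version below assumes a hypothetical row_seq column that is unique and reflects the intended order. That column is not part of the table in the question. With it, the cursor can declare its ORDER BY and each UPDATE hits exactly one row.

```sql
-- Hypothetical table variable; row_seq is an assumed unique ordering column.
DECLARE @t TABLE
(
    row_seq INT PRIMARY KEY,
    id      VARCHAR(10),
    number  INT,
    oxy_id  VARCHAR(2)
);

INSERT INTO @t (row_seq, id, number, oxy_id) VALUES
(1, '308_2123', 36, 'ZY'),
(2, '308_2123', 36, NULL),
(3, '309_5647', 30, NULL),
(4, '309_5647', 30, 'AB'),
(5, '309_5647', 31, NULL);

DECLARE @seq INT, @id VARCHAR(10), @oxy VARCHAR(2);
DECLARE @prev_id VARCHAR(10) = NULL, @prev_oxy VARCHAR(2) = NULL;

-- STATIC works on a snapshot, so the updates below do not disturb the scan.
DECLARE cur CURSOR LOCAL STATIC FOR
    SELECT row_seq, id, oxy_id FROM @t ORDER BY row_seq;

OPEN cur;
FETCH NEXT FROM cur INTO @seq, @id, @oxy;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- A new id starts with nothing to carry forward.
    IF @prev_id IS NULL OR @id <> @prev_id
    BEGIN
        SET @prev_oxy = NULL;
    END

    IF @oxy IS NOT NULL
    BEGIN
        SET @prev_oxy = @oxy;          -- remember the latest known value
    END
    ELSE IF @prev_oxy IS NOT NULL
    BEGIN
        UPDATE @t SET oxy_id = @prev_oxy WHERE row_seq = @seq;
    END

    SET @prev_id = @id;
    FETCH NEXT FROM cur INTO @seq, @id, @oxy;
END

CLOSE cur;
DEALLOCATE cur;

SELECT row_seq, id, number, oxy_id FROM @t ORDER BY row_seq;
-- oxy_id, in row_seq order: ZY, ZY, NULL, AB, AB
```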

EDIT

OK, so here is a version without a cursor. As mentioned by gordy, the problem with LAG is the repeated numbers. The same problem restricts the use of UPDATE, since there is no unique identifier for a row. Instead I insert the results into a temporary table, delete the originals, and then re-insert from the temporary table. If you do in fact have a unique key, please replace the delete and insert with an UPDATE. The following solution, whilst a bit long-winded, gets around these problems and, according to my research, should work on Amazon Redshift, though I do not have access to test it (on Redshift, @demo would have to be an ordinary or temporary table, since table variables are a SQL Server feature). I will not repeat the inserts; please copy them from above.

declare @demo table
(
id varchar (10),
number int,
oxy_id varchar(2)
)

create table allrownums
(
id varchar (10),
number int,
oxy_id varchar(2),
rownum int
)

-- Number every row. Rows that tie on (id, number) are numbered in arbitrary order.
INSERT INTO allrownums
SELECT id, number, oxy_id, ROW_NUMBER() OVER (ORDER BY id, number) AS rownum
FROM @demo;

create table allnotnullrows
(
id varchar (10),
number int,
oxy_id varchar(2),
rownum int
)

-- Rows that already have an oxy_id.
INSERT INTO allnotnullrows
SELECT * FROM allrownums
WHERE oxy_id IS NOT NULL;

create table maxrownums
(
id varchar (10),
rownum int,
maxrownum int
)
-- For each row, find the rownum of the latest non-NULL row at or before it for the same id.
INSERT INTO maxrownums
SELECT a.id, a.rownum, Max(n.rownum)
FROM allrownums a INNER JOIN allnotnullrows n
ON n.id = a.id WHERE a.rownum >= n.rownum
GROUP BY a.id, a.rownum;

create table tempresults
(
id varchar (10),
number int,
oxy_id varchar(2)
)  
-- Keep a row's own oxy_id if it has one; otherwise take it from the row found in maxrownums.
INSERT INTO tempresults
SELECT a.id, a.number, coalesce(a.oxy_id, n.oxy_id) as oxy_id
FROM allrownums a
LEFT JOIN maxrownums m
ON m.rownum = a.rownum
LEFT JOIN allnotnullrows n
ON a.id = n.id
and n.rownum = m.maxrownum;

DELETE FROM @demo;

INSERT INTO @demo SELECT * FROM tempresults;

DROP TABLE tempresults;
DROP TABLE allrownums;
DROP TABLE allnotnullrows;
DROP TABLE maxrownums;

SELECT * FROM @demo;
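
If the table does have a unique key, the delete and re-insert step above can become a single UPDATE. Here is a minimal sketch in T-SQL, assuming a hypothetical #keyed_demo table whose row_seq column is that key and also defines the row order. The derived table m plays the role of maxrownums. The UPDATE ... FROM form used here is SQL Server syntax and may need adjusting for Redshift.

```sql
-- Hypothetical table; row_seq is an assumed unique key that also gives the row order.
CREATE TABLE #keyed_demo
(
    row_seq INT PRIMARY KEY,
    id      VARCHAR(10),
    number  INT,
    oxy_id  VARCHAR(2)
);

INSERT INTO #keyed_demo (row_seq, id, number, oxy_id) VALUES
(1, '308_2123', 36, 'ZY'),
(2, '308_2123', 36, NULL),
(3, '308_2123', 37, NULL),
(4, '310_8897', 20, NULL),
(5, '310_8897', 21, 'DC'),
(6, '310_8897', 22, NULL);

UPDATE d
SET oxy_id = src.oxy_id
FROM #keyed_demo AS d
JOIN (
        -- For every NULL row, the key of the latest earlier non-NULL row with the same id.
        SELECT a.row_seq, MAX(n.row_seq) AS src_seq
        FROM #keyed_demo AS a
        JOIN #keyed_demo AS n
          ON n.id = a.id
         AND n.row_seq < a.row_seq
         AND n.oxy_id IS NOT NULL
        WHERE a.oxy_id IS NULL
        GROUP BY a.row_seq
     ) AS m
  ON m.row_seq = d.row_seq
JOIN #keyed_demo AS src
  ON src.row_seq = m.src_seq;

SELECT row_seq, id, number, oxy_id FROM #keyed_demo ORDER BY row_seq;
-- oxy_id, in row_seq order: ZY, ZY, ZY, NULL, DC, DC

DROP TABLE #keyed_demo;
```

Row 4 is left NULL because nothing precedes it within its id, so it never appears in m.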

This should work in SQL Server 2008 (I have no way to test it, but CROSS APPLY is available in SQL Server 2008).

CREATE TABLE #table1( ID INT, Number INT, OXY_ID VARCHAR( 2 ))
INSERT INTO #table1
VALUES
( 1, 23, 'AD' ),
( 2, 23, 'XY' ),
( 3, 23, '' ),
( 4, 23, '' ),
( 5, 23, 'MY' ),
( 6, 23, '' ),
( 7, 23, 'ZY' )

CREATE INDEX IX_table1__ID ON #table1( ID, OXY_ID )


SELECT a.*, c.OXY_ID AS OXY_ID_New
FROM #table1 AS a
    CROSS APPLY
            ( SELECT TOP 1 ID, OXY_ID
            FROM #table1 AS b 
            WHERE OXY_ID <> '' AND a.ID >= b.ID
            ORDER BY ID DESC ) AS c
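
One thing to watch with CROSS APPLY: a row with no earlier non-empty OXY_ID gets no match from the subquery and drops out of the result. The sample data above never hits this because ID 1 has a value. Below is a minimal sketch of the same query with OUTER APPLY, which keeps such rows and returns NULL for them. The #apply_demo table and its rows are made up for illustration.

```sql
-- Hypothetical sample in which the first row has no value to inherit.
CREATE TABLE #apply_demo( ID INT, Number INT, OXY_ID VARCHAR( 2 ));

INSERT INTO #apply_demo
VALUES
( 1, 23, '' ),
( 2, 23, 'XY' ),
( 3, 23, '' ),
( 4, 23, 'MY' ),
( 5, 23, '' );

SELECT a.ID, a.Number, a.OXY_ID, c.OXY_ID AS OXY_ID_New
FROM #apply_demo AS a
    OUTER APPLY
            ( -- Latest non-empty OXY_ID at or before the current row.
              SELECT TOP 1 b.OXY_ID
              FROM #apply_demo AS b
              WHERE b.OXY_ID <> '' AND b.ID <= a.ID
              ORDER BY b.ID DESC ) AS c
ORDER BY a.ID;
-- OXY_ID_New, in ID order: NULL, XY, XY, MY, MY

DROP TABLE #apply_demo;
```

With CROSS APPLY in place of OUTER APPLY, the row with ID 1 would be missing from this result.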

Comments:

Should be a lot faster than a cursor.

LAG solution is more elegant compared to this.
