
Select the last non-null value and append it to another column (BigQuery/Python)

I have a table in BQ that looks like this:

Row     Field  DateTime  
1        one   10:00 AM    
2        null  10:05 AM     
3        null  10:10 AM    
4        one   10:30 AM    
5        null  11:00 AM    
6        two   11:15 AM    
7        two   11:30 AM 
8        null  11:35 AM
9        null  11:40 AM
10       null  11:50 AM
11       null  12:00 AM
12       null  12:15 AM
13       two   12:30 AM
14       null  12:15 AM
15       null  12:25 AM
16       null  12:35 AM
17       three 12:55 AM     

I want to create another column called prevField and fill it with the last non-null Field value, but only when the non-null values immediately before and after a run of nulls are the same. When those two values differ, prevField should remain null. The result would look like the following:

  Row     Field    DateTime  prevField
    1        one   10:00 AM   null 
    2        null  10:05 AM   one    
    3        null  10:10 AM   one   
    4        one   10:30 AM   one
    5        null  11:00 AM   null
    6        two   11:15 AM   two
    7        two   11:30 AM   two
    8        null  11:35 AM   two
    9        null  11:40 AM   two
    10       null  11:50 AM   two
    11       null  12:00 AM   two
    12       null  12:15 AM   two
    13       two   12:30 AM   two 
    14       null  12:15 AM   null
    15       null  12:25 AM   null
    16       null  12:35 AM   null
    17       three 12:55 AM   three  
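
The rule described above can be sketched in plain Python to make the intended behaviour concrete. This is only an illustration under stated assumptions, not a BigQuery solution: the function name `fill_between_equal` and the variable names are hypothetical, and rows that already hold a value are assumed to keep it (so row 1 comes out as "one" here, whereas the table above shows null for it). The two passes play the role that `LAST_VALUE(Field IGNORE NULLS)` over the preceding rows and `FIRST_VALUE(Field IGNORE NULLS)` over the following rows would play in SQL.

```python
from typing import List, Optional


def fill_between_equal(values: List[Optional[str]]) -> List[Optional[str]]:
    """Return a value per row only when the nearest non-null values
    at-or-before and at-or-after that row are equal (hypothetical helper)."""
    n = len(values)

    # Forward pass: last non-null value seen at or before each row.
    prev_seen: List[Optional[str]] = [None] * n
    last = None
    for i, v in enumerate(values):
        if v is not None:
            last = v
        prev_seen[i] = last

    # Backward pass: first non-null value seen at or after each row.
    next_seen: List[Optional[str]] = [None] * n
    nxt = None
    for i in range(n - 1, -1, -1):
        if values[i] is not None:
            nxt = values[i]
        next_seen[i] = nxt

    # Keep the value only where both directions agree; otherwise stay null.
    return [p if p is not None and p == q else None
            for p, q in zip(prev_seen, next_seen)]


# The Field column from the question, in DateTime order.
fields = ["one", None, None, "one", None, "two", "two",
          None, None, None, None, None, "two",
          None, None, None, "three"]
prev_field = fill_between_equal(fields)
```

Nulls between two equal values are filled, while nulls between "one" and "two" or between "two" and "three" stay null.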

So far I have tried the following code variations for the first part of the question (filling prevField with the last non-null Field value when the entries on either side of the nulls are the same), without success.

select Field, Datetime,

(1)--case when FieldName is null then LAG(FieldName) over (order by DateTime) else FieldName end as prevFieldName

(2)--LAST_VALUE(FieldName IGNORE NULLS) OVER (ORDER BY DateTime

(3)--ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) AS prevFieldName

(4)-- first_value(FieldName)over(order by DateTime) as prevFieldName  

from table
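
One likely reason attempt (1) falls short is that LAG with its default offset looks back exactly one row, so only the first null in a run gets filled. A minimal Python sketch of that behaviour, where `lag_fill` is a hypothetical name introduced for illustration:

```python
from typing import List, Optional


def lag_fill(values: List[Optional[str]]) -> List[Optional[str]]:
    """Mimic CASE WHEN v IS NULL THEN LAG(v) ELSE v END with a one-row LAG."""
    out: List[Optional[str]] = []
    for i, v in enumerate(values):
        if v is not None:
            out.append(v)  # non-null rows keep their own value
        else:
            # Look back exactly one row in the ORIGINAL column, as LAG does.
            out.append(values[i - 1] if i > 0 else None)
    return out


result = lag_fill(["one", None, None, "one"])
```

Here the second null stays null because the row before it is itself null in the source column.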

EDIT: I added rows to the data and changed the row numbers.

You can use the following logic to achieve your goal. Sample data creation:

WITH
Base AS
(
SELECT *
FROM(
SELECT 123 Row, 'one'  Field, '10:00 AM' DateTime  
UNION ALL
SELECT 123, null, '10:05 AM'     
UNION ALL
SELECT
123,    null,  '10:10 AM'
UNION ALL
SELECT
123   ,   'one'  , '10:30 AM'
UNION ALL
SELECT
456,null,'11:00 AM'
UNION ALL
SELECT
456,'two','11:15 AM'
UNION ALL
SELECT
789,'two','11:30 AM'))

Logic: the query computes the minimum and maximum DateTime for each Field value, plus the LAG and LEAD values for each row, and derives prevField from those.

SELECT a.Field,DateTime,
CASE WHEN a.DateTime = a.min_date THEN ''
WHEN  a.lag_field IS NOT NULL and a.lead_field IS NULL THEN a.lag_field
WHEN  a.lag_field IS NULL and a.lead_field IS NOT NULL THEN a.lead_field
WHEN a.lag_field != a.lead_field THEN a.lag_field
WHEN a.Field IS NOT NULL AND a.lag_field IS NULL AND a.lead_field IS NULL AND a.DateTime = a.Max_date THEN a.Field
ELSE ''
END as prevField
FROM(
SELECT Base.Field, DateTime,
LAG(Base.Field) OVER (ORDER BY DateTime) lag_field,
LEAD(Base.Field) OVER (ORDER BY DateTime) lead_field,
min_date, Max_date
FROM Base LEFT JOIN (SELECT Field, MIN(DateTime) min_date, MAX(DateTime) Max_date FROM Base GROUP BY Field) b
ON Base.Field = b.Field
) a

This query partly solves my problem:

 CREATE TEMP FUNCTION ToHex(x INT64) AS (
      (SELECT STRING_AGG(FORMAT('%02x', x >> (byte * 8) & 0xff), '' ORDER BY byte DESC)
       FROM UNNEST(GENERATE_ARRAY(0, 7)) AS byte)
    );
    SELECT
          DateTime
        , Field
        , SUBSTR(MAX( ToHex(row_n) || Field) OVER (ORDER BY row_n ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING), 17) AS previous
    FROM (
        SELECT *, ROW_NUMBER() over (ORDER BY DateTime) AS row_n
        FROM `xx.yy.zz`
    );
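
The trick in the query above seems to rely on prefixing each Field with its row number as 16 zero-padded hex digits, so that MAX over the earlier rows picks the most recent non-null entry (a NULL Field makes the concatenation NULL, which MAX skips), and SUBSTR from position 17 then strips the prefix. A rough Python sketch of the same idea, with the hypothetical name `previous_non_null`:

```python
from typing import List, Optional


def previous_non_null(fields: List[Optional[str]]) -> List[Optional[str]]:
    """For each row, return the latest non-null value among strictly earlier rows."""
    out: List[Optional[str]] = []
    keys: List[str] = []  # hex-prefixed keys of earlier non-null rows
    for row_n, value in enumerate(fields, start=1):
        # Window is UNBOUNDED PRECEDING .. 1 PRECEDING: current row excluded.
        out.append(max(keys)[16:] if keys else None)
        if value is not None:
            # 16 hex digits sort lexicographically in row order.
            keys.append(f"{row_n:016x}" + value)
    return out


previous = previous_non_null(["one", None, None, "one", None, "two"])
```

Note that this fills every row with the last earlier value regardless of what follows the gap, which is presumably why it only partly covers the requirement.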
