I have a table in BQ that looks like this:
Row Field DateTime
1 one 10:00 AM
2 null 10:05 AM
3 null 10:10 AM
4 one 10:30 AM
5 null 11:00 AM
6 two 11:15 AM
7 two 11:30 AM
8 null 11:35 AM
9 null 11:40 AM
10 null 11:50 AM
11 null 12:00 AM
12 null 12:15 AM
13 two 12:30 AM
14 null 12:15 AM
15 null 12:25 AM
16 null 12:35 AM
17 three 12:55 AM
I want to create another column called prevField and fill it with the last non-null Field value when the non-null entries immediately before and after a run of nulls are the same. When those two entries differ, prevField should remain null. The result would look like this:
Row Field DateTime prevField
1 one 10:00 AM null
2 null 10:05 AM one
3 null 10:10 AM one
4 one 10:30 AM one
5 null 11:00 AM null
6 two 11:15 AM two
7 two 11:30 AM two
8 null 11:35 AM two
9 null 11:40 AM two
10 null 11:50 AM two
11 null 12:00 AM two
12 null 12:15 AM two
13 two 12:30 AM two
14 null 12:15 AM null
15 null 12:25 AM null
16 null 12:35 AM null
17 three 12:55 AM three
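To make the rule concrete, here is a minimal sketch in plain Python (not BigQuery SQL). It assumes the rows are processed in Row order and that the rule amounts to: carry the last non-null Field forward, carry the next non-null Field backward, and keep the value only where the two agree. The function name `fill_between_equal_neighbours` is made up for illustration. In BigQuery the two passes would presumably map to `LAST_VALUE(Field IGNORE NULLS)` over the preceding rows and `FIRST_VALUE(Field IGNORE NULLS)` over the following rows, but that mapping is an assumption and is not tested here.

```python
from typing import List, Optional


def fill_between_equal_neighbours(fields: List[Optional[str]]) -> List[Optional[str]]:
    """Keep a value only where the nearest non-null values on both sides agree."""
    # Forward pass: last non-null value seen at or before each position.
    forward: List[Optional[str]] = []
    last = None
    for value in fields:
        if value is not None:
            last = value
        forward.append(last)

    # Backward pass: first non-null value seen at or after each position.
    backward: List[Optional[str]] = [None] * len(fields)
    nxt = None
    for i in range(len(fields) - 1, -1, -1):
        if fields[i] is not None:
            nxt = fields[i]
        backward[i] = nxt

    # A row gets a value only when both directions agree on it.
    return [f if f is not None and f == b else None
            for f, b in zip(forward, backward)]


# Field column of the 17 sample rows, in Row order.
fields = ["one", None, None, "one", None, "two", "two",
          None, None, None, None, None, "two",
          None, None, None, "three"]
prev_field = fill_between_equal_neighbours(fields)
for row, (field, prev) in enumerate(zip(fields, prev_field), start=1):
    print(row, field, prev)
```

Under these assumptions the sketch matches the expected prevField for rows 2 to 17. Row 1 comes out as one rather than null, so the first row would need separate handling if it must stay null.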
So far I have tried the following code variations for the first part of the question (filling prevField with the last non-null Field value when the entries on either side of the nulls are the same), but without success.
select Field, DateTime,
  -- uncomment one of the variations below
  -- variation 1:
  -- case when Field is null then LAG(Field) over (order by DateTime) else Field end as prevField
  -- variation 2:
  -- LAST_VALUE(Field IGNORE NULLS) OVER (ORDER BY DateTime
  --   ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) AS prevField
  -- variation 3:
  -- first_value(Field) over (order by DateTime) as prevField
from table
EDIT: I added rows to the data and changed the row numbers.
You can use the following logic to achieve your goal. Sample data creation:
WITH Base AS (
  SELECT *
  FROM (
    SELECT 123 Row, 'one' Field, '10:00 AM' DateTime
    UNION ALL
    SELECT 123, null, '10:05 AM'
    UNION ALL
    SELECT 123, null, '10:10 AM'
    UNION ALL
    SELECT 123, 'one', '10:30 AM'
    UNION ALL
    SELECT 456, null, '11:00 AM'
    UNION ALL
    SELECT 456, 'two', '11:15 AM'
    UNION ALL
    SELECT 789, 'two', '11:30 AM'
  )
)
Logic: the query computes the minimum and maximum DateTime for each Field value, along with the lag and lead values for each row, and uses these to determine the prevField values.
SELECT a.Field, DateTime,
  CASE
    WHEN a.DateTime = a.min_date THEN ''
    WHEN a.lag_field IS NOT NULL AND a.lead_field IS NULL THEN a.lag_field
    WHEN a.lag_field IS NULL AND a.lead_field IS NOT NULL THEN a.lead_field
    WHEN a.lag_field != a.lead_field THEN a.lag_field
    WHEN a.Field IS NOT NULL AND a.lag_field IS NULL AND a.lead_field IS NULL AND a.DateTime = a.Max_date THEN a.Field
    ELSE ''
  END AS prevField
FROM (
  SELECT Base.Field, DateTime,
    LAG(Base.Field) OVER (ORDER BY DateTime) lag_field,
    LEAD(Base.Field) OVER (ORDER BY DateTime) lead_field,
    min_date, Max_date
  FROM Base
  LEFT JOIN (
    SELECT Field, MIN(DateTime) min_date, MAX(DateTime) Max_date
    FROM Base
    GROUP BY Field
  ) b
  ON Base.Field = b.Field
) a
This query partly solves my problem:
CREATE TEMP FUNCTION ToHex(x INT64) AS (
(SELECT STRING_AGG(FORMAT('%02x', x >> (byte * 8) & 0xff), '' ORDER BY byte DESC)
FROM UNNEST(GENERATE_ARRAY(0, 7)) AS byte)
);
SELECT
DateTime
, Field
, SUBSTR(MAX( ToHex(row_n) || Field) OVER (ORDER BY row_n ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING), 17) AS previous
FROM (
SELECT *, ROW_NUMBER() over (ORDER BY DateTime) AS row_n
FROM `xx.yy.zz`
);
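The trick in this query may not be obvious, so here is a rough Python model of it (an illustration of the idea, not the query itself). Each non-null Field is prefixed with its row number as 16 zero-padded hex digits, so the string maximum over the preceding rows belongs to the most recent non-null row, and dropping the first 16 characters recovers the value. The helper names `to_hex` and `previous_non_null` are hypothetical.

```python
from typing import List, Optional


def to_hex(x: int) -> str:
    # 16 zero-padded hex digits, so string order matches numeric order
    # for non-negative row numbers.
    return format(x, "016x")


def previous_non_null(fields: List[Optional[str]]) -> List[Optional[str]]:
    """Model MAX(ToHex(row_n) || Field) over all strictly preceding rows."""
    result: List[Optional[str]] = []
    best: Optional[str] = None  # running maximum of the prefixed strings
    for row_n, field in enumerate(fields, start=1):
        # Emit before updating: the window ends at 1 PRECEDING.
        # SUBSTR(s, 17) is 1-indexed in SQL, so it corresponds to s[16:].
        result.append(best[16:] if best is not None else None)
        if field is not None:  # a null Field yields a null key, which MAX skips
            key = to_hex(row_n) + field
            if best is None or key > best:
                best = key
    return result


sample = ["one", None, None, "one", None, "two", "two"]
previous = previous_non_null(sample)
print(previous)
```

This always returns the last non-null value before the current row, whatever comes after it. It never checks the value that follows the nulls, which is why the query only partly solves the problem.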