
BigQuery Challenge: How can we backfill a table with multiple records in Google BigQuery?

I came across a challenging BigQuery problem and would like some help with it. Here's the background: I am trying to attach a membershipID to each row of Table A below using Table B. Mapping the membershipID onto timestamp 2 and timestamp 4 in Table A is easy with a left join. What I find challenging is backfilling the membershipID of the next login onto timestamp 1 and timestamp 3.

It is an odd question, so here's some more info:

  • One device can be logged in to with multiple membershipIDs
  • Not every event captures the membershipID, which is why we want to map the membershipID from the next "login" event (i.e. Table B)

For timestamp 1, we want to map to MemberID_111 (as the session ends at timestamp 2).

For timestamp 3, we want to map to MemberID_222 (as the session ends at timestamp 4).

Table A : The main event table, with different timestamps and events, all of which happened on the same device.

timestamp event device
1 A device_A
2 B device_A
3 C device_A
4 A device_A

Table B : Records the timestamp at which each user logged in.

timestamp device membershipID
2 device_A MemberID_111
4 device_A MemberID_222

Output : The output table we want to produce with BigQuery:

timestamp event device membershipID
1 A device_A MemberID_111
2 B device_A MemberID_111
3 C device_A MemberID_222
4 A device_A MemberID_222

Hopefully my question is clear enough. Thanks, everyone!

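-- The LEFT JOIN leaves membershipID NULL on rows without a login.
-- FIRST_VALUE ... IGNORE NULLS then looks from the current row forward in time
-- and picks the first non-NULL membershipID for the same device.
SELECT * EXCEPT(membershipID),
       FIRST_VALUE(membershipID IGNORE NULLS) OVER w AS membershipID
  FROM tableA LEFT JOIN tableB USING (device, timestamp)
WINDOW w AS (PARTITION BY device ORDER BY timestamp ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
 ORDER BY timestamp;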

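-- or
-- Same result with the ordering reversed: with ORDER BY timestamp DESC the default
-- window frame runs from the latest row down to the current row, so the last
-- non-NULL value in it is the nearest login at or after the current timestamp.
SELECT * EXCEPT(membershipID),
       LAST_VALUE(membershipID IGNORE NULLS) OVER w AS membershipID
  FROM tableA LEFT JOIN tableB USING (device, timestamp)
WINDOW w AS (PARTITION BY device ORDER BY timestamp DESC)
 ORDER BY timestamp;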
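
To try the approach without creating any tables, the sample rows from the question can be inlined as CTEs. This is only a sketch: the names table_a and table_b and the aliases a and b are assumptions made for illustration, and the columns are listed explicitly instead of using SELECT * EXCEPT.

```sql
-- Sample data copied from the question; table_a / table_b are illustrative names.
WITH table_a AS (
  SELECT * FROM UNNEST([
    STRUCT(1 AS timestamp, 'A' AS event, 'device_A' AS device),
    STRUCT(2, 'B', 'device_A'),
    STRUCT(3, 'C', 'device_A'),
    STRUCT(4, 'A', 'device_A')
  ])
),
table_b AS (
  SELECT * FROM UNNEST([
    STRUCT(2 AS timestamp, 'device_A' AS device, 'MemberID_111' AS membershipID),
    STRUCT(4, 'device_A', 'MemberID_222')
  ])
)
SELECT
  a.timestamp,
  a.event,
  a.device,
  -- Look from the current row forward and take the first non-NULL login.
  FIRST_VALUE(b.membershipID IGNORE NULLS) OVER (
    PARTITION BY a.device
    ORDER BY a.timestamp
    ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
  ) AS membershipID
FROM table_a AS a
LEFT JOIN table_b AS b
  ON a.device = b.device AND a.timestamp = b.timestamp
ORDER BY a.timestamp;
```

On this sample data the query should reproduce the Output table from the question. Note that any event occurring after the last login on a device has no later login to borrow from, so its membershipID stays NULL.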

