ROW_NUMBER() with a condition in BigQuery

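I would really appreciate some help with this. I have a dataset of tour purchases. Each tour has a Purchaser_Email and an Event_Date, plus many other columns that are not relevant here. I want a Trip column that identifies whether an event belongs to a new trip or to the same trip. For a purchase to count as a new trip, the difference between two Event_Dates must be more than 30 days; otherwise it is considered part of the same trip. In the end, I need to know how many trips each customer made and to group the purchases by trip. I wrote a query that uses ROW_NUMBER() and calculates the date_diff between the first purchase and the next one. I feel like I am close, but I need some help adding the Trip column.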

I need something like this: Desired Column

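The sample dataset and the column I need are in this file: https://docs.google.com/spreadsheets/d/1ToNFQ9l2-ztDrN2zSlKlgBQk95vO6BnRv6VabWrHBmM/edit?usp=sharing The raw data is in the first tab, the query results are in the second tab with the orange columns, and the last column, in red, is the one I am looking for.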

WITH NumberedDates AS (
  SELECT
    City
    , Booking
    , Purchase_Date
    , Purchaser_Email
    , Guest_Info
    , Addr_1
    , City_7
    , State_Province
    , Country
    , Gross_Sales
    , Event_Date
    , Event_Name
    , MIN(Event_Date) OVER (PARTITION BY Purchaser_Email) AS minPurchDate
    , ROW_NUMBER() OVER (PARTITION BY Purchaser_Email ORDER BY Event_Date) AS RowNo
  FROM SalesEatingEurope.DymTable
)

SELECT
  n1.City
  , n1.Booking
  , n1.Purchase_Date
  , n1.Purchaser_Email
  , n1.Guest_Info
  , n1.Addr_1
  , n1.City_7
  , n1.State_Province
  , n1.Country
  , n1.Gross_Sales
  , n1.Event_Name
  , n1.Event_Date
  , n1.RowNo AS TransactionNumber
  , n2.Event_Date AS PrevEventDate
  , IFNULL(DATE_DIFF(EXTRACT(DATE FROM n2.Event_Date), EXTRACT(DATE FROM n1.Event_Date), DAY), 0) * -1 AS DaysSincePrevEvent
  , n1.minPurchDate AS FirstEvent
  , IFNULL(DATE_DIFF(EXTRACT(DATE FROM n1.minPurchDate), EXTRACT(DATE FROM n1.Event_Date), DAY), 0) * -1 AS DaysSinceFirstEvent
FROM NumberedDates AS n1
LEFT JOIN NumberedDates AS n2
  ON n1.Purchaser_Email = n2.Purchaser_Email
  AND n1.RowNo = n2.RowNo + 1
ORDER BY n1.Purchaser_Email, n1.Event_Date

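You are on the right track. After partitioning and assigning row_number() or rank(), you can set a boolean flag based on whether two consecutive purchases are separated by a given delta.

Here is one way to achieve this: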

with data as (
  -- row_number() gives every purchase a unique, gap-free index per purchaser;
  -- rank() would repeat an index on tied dates and break the indx-1 join below.
  select purchaser_email, event_date, row_number() over (partition by purchaser_email order by event_date) as indx from (
    select 'abc_xyz@xyz.com' as purchaser_email, date('2018-10-15') as event_date union all
    select 'abc_xyz@xyz.com' as purchaser_email, date('2018-10-12') as event_date union all
    select 'abc_xyz@xyz.com' as purchaser_email, date('2018-10-19') as event_date union all
    select 'fgh_xyz@xyz.com' as purchaser_email, date('2018-10-03') as event_date union all
    select 'fgh_xyz@xyz.com' as purchaser_email, date('2018-10-10') as event_date union all
    select 'fgh_xyz@xyz.com' as purchaser_email, date('2018-11-26') as event_date union all
    select 'abc_xyz@xyz.com' as purchaser_email, date('2018-11-28') as event_date union all
    select 'abc_xyz@xyz.com' as purchaser_email, date('2018-12-30') as event_date union all
    select 'abc_xyz@xyz.com' as purchaser_email, date('2018-12-31') as event_date
  )
)
-- Count, per purchaser, the purchases that start a new trip.
select purchaser_email, count(1) as order_count from (
  select purchaser_email,
    -- purchase_count is the running number of trip starts up to this purchase
    d1, new_purchase, sum(case when new_purchase=true then 1 else 0 end) over (partition by purchaser_email order by d1) as purchase_count from (
    select
      t1.purchaser_email,
      t1.event_date as d1,
      t2.event_date as d2,
      t1.indx as t1i,
      t2.indx as t2i,
      case
        when t2.event_date is null then true  -- first purchase of this purchaser
        when abs(date_diff(t1.event_date, t2.event_date, day)) > 30 then true  -- gap of more than 30 days, as the question requires
        else false end as new_purchase
      from data t1
      -- t2 is the previous purchase by the same purchaser
      left join data t2 on t1.purchaser_email = t2.purchaser_email and t1.indx-1 = t2.indx
  )
  order by 1,2,3
)
where new_purchase = true
group by 1
order by 1
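
The query above returns one trip count per purchaser, while the question also asks for a Trip value on every purchase row. A minimal sketch of that variant follows, assuming BigQuery standard SQL and DATE-typed event dates. It swaps the self-join for LAG() and numbers trips with a running SUM(); the CTE names purchases and flagged are hypothetical, and the sample rows are a subset of the ones used above. If Event_Date is a TIMESTAMP, as the EXTRACT(DATE FROM ...) calls in the question suggest, it would need to be converted with DATE(Event_Date) first.

```sql
-- Sketch only: "purchases" and "flagged" are illustrative names, not from the question.
WITH purchases AS (
  SELECT 'abc_xyz@xyz.com' AS Purchaser_Email, DATE '2018-10-12' AS Event_Date UNION ALL
  SELECT 'abc_xyz@xyz.com', DATE '2018-10-15' UNION ALL
  SELECT 'abc_xyz@xyz.com', DATE '2018-11-28' UNION ALL
  SELECT 'fgh_xyz@xyz.com', DATE '2018-10-03' UNION ALL
  SELECT 'fgh_xyz@xyz.com', DATE '2018-11-26'
),
flagged AS (
  SELECT
    Purchaser_Email,
    Event_Date,
    -- Days since this purchaser's previous event; NULL for their first event.
    DATE_DIFF(
      Event_Date,
      LAG(Event_Date) OVER (PARTITION BY Purchaser_Email ORDER BY Event_Date),
      DAY
    ) AS DaysSincePrevEvent
  FROM purchases
)
SELECT
  Purchaser_Email,
  Event_Date,
  DaysSincePrevEvent,
  -- Running count of trip starts: the first event, or a gap of more than 30 days.
  SUM(IF(DaysSincePrevEvent IS NULL OR DaysSincePrevEvent > 30, 1, 0))
    OVER (PARTITION BY Purchaser_Email ORDER BY Event_Date) AS Trip
FROM flagged
ORDER BY Purchaser_Email, Event_Date
```

With a Trip value on each row, grouping by Purchaser_Email and Trip groups the purchases by trip, and MAX(Trip) per Purchaser_Email gives the number of trips each customer made.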
