
How do I use the lag function to skip a row? PostgreSQL 9.3

I am currently trying to write a query that lists users' web-browsing behavior from a table. The table looks like this:

**RecordID  RespondentID    DeviceID    UTCTimestamp    Domain**
1   01faca75-1216-4a55-b43c-9d64ade852f7    4DF57C06-F0BD-4779-8983-37A8B02E5EDF    06/11/2017 10:21    goodreads.com 
2   01faca75-1216-4a55-b43c-9d64ade852f7    4DF57C06-F0BD-4779-8983-37A8B02E5EDF    06/11/2017 10:21    goodreads.com 
3   01faca75-1216-4a55-b43c-9d64ade852f7    4DF57C06-F0BD-4779-8983-37A8B02E5EDF    06/11/2017 10:21    gr-assets.com 
4   01faca75-1216-4a55-b43c-9d64ade852f7    4DF57C06-F0BD-4779-8983-37A8B02E5EDF    06/11/2017 10:21    gr-assets.com 
5   01faca75-1216-4a55-b43c-9d64ade852f7    4DF57C06-F0BD-4779-8983-37A8B02E5EDF    06/11/2017 10:23    itunes.apple.com 
6   01faca75-1216-4a55-b43c-9d64ade852f7    4DF57C06-F0BD-4779-8983-37A8B02E5EDF    06/11/2017 10:23    itunes.apple.com 
7   01faca75-1216-4a55-b43c-9d64ade852f7    4DF57C06-F0BD-4779-8983-37A8B02E5EDF    06/11/2017 10:51    samplicio.us 
8   01faca75-1216-4a55-b43c-9d64ade852f7    4DF57C06-F0BD-4779-8983-37A8B02E5EDF    06/11/2017 10:51    samplicio.us

Thanks to everyone's help, I was able to produce the following:

**RecordID  RespondentID    UTCTimestamp    Source Domain    To Domain    RecordID**

2   01faca75-1216-4a55-b43c-9d64ade852f7    06/11/2017 10:21    goodreads.com   gr-assets.com   3
4   01faca75-1216-4a55-b43c-9d64ade852f7    06/11/2017 10:21    gr-assets.com   itunes.apple.com    5
6   01faca75-1216-4a55-b43c-9d64ade852f7    06/11/2017 10:23    itunes.apple.com    samplicio.us    7

"To Domain" is the value from the next row whose domain name is different.

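The problem: although this looks correct, we are actually skipping the very first record. Given this data set, the first row's Domain is joined to the second row's Domain, and since the two are identical that pair is skipped. Row 2 is then joined with row 3, so the first result record shows RecordID 2. I would like to refine this further. Because the domains are the same, my result should start from RecordID 1 and skip RecordID 2, so the result should look like this:

**RecordID  RespondentID    UTCTimestamp    Source Domain    To Domain**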

1   01faca75-1216-4a55-b43c-9d64ade852f7    06/11/2017 10:21    goodreads.com   gr-assets.com   
3   01faca75-1216-4a55-b43c-9d64ade852f7    06/11/2017 10:21    gr-assets.com   itunes.apple.com    
5   01faca75-1216-4a55-b43c-9d64ade852f7    06/11/2017 10:23    itunes.apple.com    samplicio.us

I tried to skip RecordID 2, but I ran into an SQL error on 'prev_name'.

SELECT t1."RecordID",
       t1."RespondentID",
       t1."UTCTimestamp",
       t1."Domain" AS "Source Domain",
       t2."Domain" AS "To Domain",
       t2."RecordID",
       lag(t1."Domain", 1) OVER (ORDER BY t1."RecordID") AS prev_name
FROM public."Traffic - Mobile" AS t1
JOIN public."Traffic - Mobile" AS t2
  ON t2."RespondentID" = t1."RespondentID"
 AND t2."DeviceID" = t1."DeviceID"
 AND t2."RecordID" = t1."RecordID" + 1
 AND t1."Domain" <> t2."Domain"
 AND t2."UTCTimestamp" >= t1."UTCTimestamp"
 AND t2."Sequence" - t1."Sequence" = 1
 AND t1."RecordID" < 13
 AND t1."Domain" <> prev_name;

What am I doing wrong?

The end result I would eventually like to achieve is below:

**RecordID  RespondentID    UTCTimestamp    Source Domain    To Domain    Final Destination**

1   01faca75-1216-4a55-b43c-9d64ade852f7    06/11/2017 10:21    goodreads.com   gr-assets.com   samplicio.us
3   01faca75-1216-4a55-b43c-9d64ade852f7    06/11/2017 10:21    gr-assets.com   itunes.apple.com    samplicio.us
5   01faca75-1216-4a55-b43c-9d64ade852f7    06/11/2017 10:23    itunes.apple.com    samplicio.us    samplicio.us

There is one additional column called "Final Destination". It lets me group the 3 transitions together as the path that leads to samplicio.us.

Thanks in advance.

Try this:

with t1 as
(
    select
        recordid,
        respondentid,
        deviceid,
        utctimestamp,
        domain,
        -- position of each row within a respondent/device, in time order
        row_number() over (partition by respondentid, deviceid
                           order by utctimestamp, recordid) as user_seq,
        -- position of each row within a respondent/device/domain;
        -- 1 marks the first time this domain appears
        row_number() over (partition by respondentid, deviceid, domain
                           order by utctimestamp, recordid) as user_domain_seq
    from traffic_mobile
)

select *
from
(
    select
        recordid,
        respondentid,
        deviceid,
        utctimestamp,
        domain,
        -- domain of the next kept row for the same respondent/device
        lead(domain) over (partition by respondentid, deviceid
                           order by user_seq) as next_domain,
        -- last domain of the whole path (the "Final Destination")
        last_value(domain) over (partition by respondentid, deviceid
                                 order by user_seq
                                 rows between unbounded preceding
                                          and unbounded following) as final_domain
    from t1
    where user_domain_seq = 1   -- keep only the first row of each domain
) t2
where t2.next_domain is not null

sqlfiddle: sqlfiddle.com/#!17/dc248/3
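
As for the error in the question: the result of a window function such as lag() cannot be referenced in ON or WHERE at the same query level, so prev_name has to be computed in a subquery or CTE first and filtered one level up. A minimal sketch of that approach follows, assuming the same traffic_mobile table and lower-case column names used in the query above, and assuming domain is never null. The names marked, changes, prev_domain, source_domain and to_domain are made up for illustration.

```sql
-- Sketch only: assumes the traffic_mobile table and columns used above.
-- "marked", "changes" and "prev_domain" are illustrative names.
with marked as (
    select
        recordid,
        respondentid,
        deviceid,
        utctimestamp,
        domain,
        -- domain of the previous row for the same respondent and device
        lag(domain) over (partition by respondentid, deviceid
                          order by utctimestamp, recordid) as prev_domain
    from traffic_mobile
),
changes as (
    -- keep only rows where the domain differs from the previous row;
    -- the first row of each partition has a null prev_domain and is kept
    select *
    from marked
    where prev_domain is null
       or domain <> prev_domain
)
select *
from (
    select
        recordid,
        respondentid,
        utctimestamp,
        domain as source_domain,
        -- domain of the next kept row, i.e. the next change of domain
        lead(domain) over (partition by respondentid, deviceid
                           order by utctimestamp, recordid) as to_domain
    from changes
) t
where to_domain is not null;
```

On the sample rows in the question this should keep RecordIDs 1, 3, 5 and 7 in the changes step and return the three transitions starting at RecordIDs 1, 3 and 5. Unlike the user_domain_seq = 1 filter, this variant treats a later return to an already visited domain as a new step in the path, which may or may not be what is wanted.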

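P.S. For users who have only one entry in the traffic_mobile table, this query returns no rows. If you need them, the query will have to be adjusted to include them.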
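
One possible way to include those users, shown only as a sketch under the same assumptions about traffic_mobile and its columns, is to count how many rows remain per respondent and device after the user_domain_seq = 1 filter and let through the pairs that have exactly one. The names firsts and kept_rows are hypothetical, and the final_domain column is left out for brevity.

```sql
-- Sketch only: same assumed table and columns as the answer's query;
-- "firsts" and "kept_rows" are illustrative names.
with firsts as (
    select
        recordid,
        respondentid,
        deviceid,
        utctimestamp,
        domain,
        row_number() over (partition by respondentid, deviceid, domain
                           order by utctimestamp, recordid) as user_domain_seq
    from traffic_mobile
)
select *
from (
    select
        recordid,
        respondentid,
        deviceid,
        utctimestamp,
        domain,
        lead(domain) over (partition by respondentid, deviceid
                           order by utctimestamp, recordid) as next_domain,
        -- rows left for this respondent/device after the where clause below
        -- (window functions are evaluated after where)
        count(*) over (partition by respondentid, deviceid) as kept_rows
    from firsts
    where user_domain_seq = 1
) t2
where t2.next_domain is not null
   or t2.kept_rows = 1;   -- respondent/device pairs with a single domain
```

Rows that come through the second condition have a null next_domain. Note that it also admits users with several entries that are all on the same domain, not just users with a single entry.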

