
Update value based on value from another record of same table

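Here I have a sample table of website visitors. As we can see, visitors sometimes do not provide their email. They may also switch to a different email address over time.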

  • Original table: [image not reproduced]

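I want to update this table according to the following requirements:

  1. When a visitor provides an email for the first time, all of his past visits are tagged with that email.
  2. All of his later visits are also tagged with that email, until he switches to another email.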
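The two rules can be sketched outside the database for a single visitor. This is only an illustration of the intended result, not part of the original question: the function name `fill_emails` and the sample addresses are hypothetical, and the input is assumed to be one visitor's emails already sorted by visit date.

```python
from typing import List, Optional


def fill_emails(emails: List[Optional[str]]) -> List[Optional[str]]:
    """Fill missing emails for ONE visitor; input is ordered by visit date."""
    filled = list(emails)

    # Rule 2: carry the most recently seen email forward to later visits.
    last = None
    for i, email in enumerate(filled):
        if email is not None:
            last = email
        else:
            filled[i] = last

    # Rule 1: visits before the first email get that first email.
    first = next((e for e in emails if e is not None), None)
    return [first if e is None else e for e in filled]


# Hypothetical sample: two anonymous visits, then a@, then a switch to b@.
sample = [None, None, "a@example.com", None, "b@example.com", None]
print(fill_emails(sample))
```

A visitor who never provides an email keeps `None` on every visit, since there is nothing to copy from.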

  • Expected table after the update: [image not reproduced]

Is there a way to do this in Redshift or T-SQL?

Thanks, everyone!

If we assume that the table is named Visits and that its primary key consists of the columns Visitor_id and Activity_Date, then you can do the following in T-SQL:

  • Using a correlated subquery:
update a
set a.Email = coalesce(
  -- select the email used previously
  (
    select top 1 Email from Visits
    where Email is not null and Activity_Date < a.Activity_Date and Visitor_id = a.Visitor_id
    order by Activity_Date desc
  ),
  -- if there was no email used previously then select the email used next
  (
    select top 1 Email from Visits
    where Email is not null and Activity_Date > a.Activity_Date and Visitor_id = a.Visitor_id
    order by Activity_Date
  )
)
from Visits a
where a.Email is null;
  • Using a window function to provide the ordering:
update v
set Email = vv.Email
from Visits v
  join (
    select
      v.Visitor_id,
      coalesce(a.Email, b.Email) as Email,
      v.Activity_Date,
      row_number() over (partition by v.Visitor_id, v.Activity_Date
                         order by a.Activity_Date desc, b.Activity_Date) as Row_num
    from Visits v
      -- previous visits with email
      left join Visits a
        on a.Visitor_id = v.Visitor_id
        and a.Email is not null
        and a.Activity_Date < v.Activity_Date
      -- next visits with email if there are no previous visits
      left join Visits b
        on b.Visitor_id = v.Visitor_id
        and b.Email is not null
        and b.Activity_Date > v.Activity_Date
        and a.Visitor_id is null
    where v.Email is null
  ) vv
    on vv.Visitor_id = v.Visitor_id
    and vv.Activity_Date = v.Activity_Date
where
  vv.Row_num = 1;
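To try the correlated-subquery idea without a SQL Server instance, here is a minimal sketch that runs against an in-memory SQLite database. It is an adaptation, not the T-SQL above: SQLite has no `top 1`, so `limit 1` is used instead, the `update ... from` alias form is replaced by referencing the table name directly, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table Visits (
        Visitor_id    integer not null,
        Activity_Date text    not null,  -- ISO dates sort correctly as text
        Email         text,
        primary key (Visitor_id, Activity_Date)
    );
    -- Hypothetical sample data: visitor 2 never provides an email.
    insert into Visits values
        (1, '2020-01-01', null),
        (1, '2020-01-02', 'a@example.com'),
        (1, '2020-01-03', null),
        (1, '2020-01-04', 'b@example.com'),
        (1, '2020-01-05', null),
        (2, '2020-01-01', null);
""")

conn.execute("""
    update Visits
    set Email = coalesce(
        -- the email used most recently before this visit
        (select p.Email from Visits p
          where p.Email is not null
            and p.Visitor_id = Visits.Visitor_id
            and p.Activity_Date < Visits.Activity_Date
          order by p.Activity_Date desc limit 1),
        -- otherwise the first email used after this visit
        (select n.Email from Visits n
          where n.Email is not null
            and n.Visitor_id = Visits.Visitor_id
            and n.Activity_Date > Visits.Activity_Date
          order by n.Activity_Date limit 1)
    )
    where Email is null
""")

emails = [row[0] for row in conn.execute(
    "select Email from Visits order by Visitor_id, Activity_Date")]
print(emails)
```

Visitor 2 stays null because neither subquery finds an email to copy.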

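For each visitor_id, you can fill a null email with the most recent earlier non-null value. If there is none, use the next non-null value instead. You can get both values as follows: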

select 
    v.*, v_prev.email prev_email, v_next.email next_email
from
    visits v
    left join visits v_prev on v.visitor_id = v_prev.visitor_id 
        and v_prev.activity_date = (select max(v2.activity_date) from visits v2 where v2.visitor_id = v.visitor_id and v2.activity_date < v.activity_date and v2.email is not null)
    left join visits v_next on v.visitor_id = v_next.visitor_id 
        and v_next.activity_date = (select min(v2.activity_date) from visits v2 where v2.visitor_id = v.visitor_id and v2.activity_date > v.activity_date and v2.email is not null)
where 
    v.email is null

In SQL Server or Redshift, you can use a subquery to calculate the email:

select t.*,
       coalesce(email,
                max(email) over (partition by visitor_id, grp),
                max(case when activity_date = first_email_date then email end) over (partition by visitor_id)
                ) as imputed_email
from (select t.*,
             min(case when email is not null then activity_date end) over 
                  (partition by visitor_id order by activity_date rows between unbounded preceding and current row) as first_email_date,
             count(email) over (partition by visitor_id order by activity_date rows between unbounded preceding and current row) as grp
      from t
     ) t;

You can then use this in an update:

update t
    set email = tt.imputed_email
    from (select t.*,
                 coalesce(email,
                          max(email) over (partition by visitor_id, grp),
                          max(case when activity_date = first_email_date then email end) over (partition by visitor_id)
                         ) as imputed_email
          from (select t.*,
                       min(case when email is not null then activity_date end) over
                           (partition by visitor_id order by activity_date rows between unbounded preceding and current row) as first_email_date,
                       count(email) over (partition by visitor_id order by activity_date rows between unbounded preceding and current row) as grp
                from t
               ) t
         ) tt
    where tt.visitor_id = t.visitor_id and
          tt.activity_date = t.activity_date and
          t.email is null;
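To see why the running `count(email)` works as a group key, the sketch below runs the same select shape against an in-memory SQLite database and prints `grp` next to the imputed email. It assumes SQLite 3.25 or later (needed for window functions), uses hypothetical sample rows, and names the derived table `s` for readability. It is an illustration of the technique, not a tested Redshift or SQL Server script.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table t (visitor_id integer, activity_date text, email text);
    -- Hypothetical sample data for a single visitor.
    insert into t values
        (1, '2020-01-01', null),
        (1, '2020-01-02', 'a@example.com'),
        (1, '2020-01-03', null),
        (1, '2020-01-04', 'b@example.com'),
        (1, '2020-01-05', null);
""")

rows = conn.execute("""
    select activity_date,
           grp,
           coalesce(email,
                    -- each group holds exactly one non-null email (or none for grp = 0)
                    max(email) over (partition by visitor_id, grp),
                    -- rows before the first email fall back to that first email
                    max(case when activity_date = first_email_date then email end)
                        over (partition by visitor_id)
                   ) as imputed_email
    from (select t.*,
                 min(case when email is not null then activity_date end)
                     over (partition by visitor_id order by activity_date
                           rows between unbounded preceding and current row) as first_email_date,
                 -- the running count increases by one at every non-null email
                 count(email)
                     over (partition by visitor_id order by activity_date
                           rows between unbounded preceding and current row) as grp
          from t
         ) s
    order by activity_date
""").fetchall()

for row in rows:
    print(row)
```

The visit on 2020-01-01 gets `grp = 0` because no email has been seen yet, so the second `coalesce` argument is null and the third one supplies the first email.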

