
Updating query to return most recent revenue date/value prior to previous year/quarter date/values (snowflake)

I have written the following query. Joins f7 and f8 are there because the revenue for the previous quarter/year is sometimes NULL, but only for that one day. If the revenue 15 days prior was positive, then we know it was still an active account, and the NULL was due to a temporary lapse in the contract.

Anyway, I'm trying to update this so that instead of 15 days prior to the previous quarter and year for each day, I get the last actual revenue value prior to the previous quarter/year date. I'm not sure if this is possible, because the join would be on a different date for each account. So maybe another approach is needed. Any help would be appreciated.

Let me know if I've explained this sufficiently.

with
          arr_base as (select * from arr_base_table opp)
          ,cte_accounts as (select distinct account_id,
                                  account_name
                                  ,account_owner_name
                                  ,account_region_c
                                  ,account_theater_c
                                  ,owner_theater_c
                                  ,customer_first_purchase_date
                                  ,cohort_date
                                  from arr_base)
          ,cte_account_product_info as (select account_id
                                               ,account_name
                                               ,activity_date
                                               ,line_item_count
                                               ,has_casb_count
                                               ,has_casb_api_count
                                               ,has_casb_inline_count
                                               ,has_swg_count
                                               ,has_ng_swg_count
                                               ,has_swg_all_count
                                               ,has_npa_count
                                               ,has_iaas_count
                                               ,has_dlp_count
                                               ,has_dlp_adv_count
                                               ,has_dlp_std_count
                                               ,has_firewall_count
                                               ,has_cspm_count
                                               ,has_email_count
                                               ,has_rbi_count
                                               ,has_support_count
                                               ,npa_user_count
                                               ,is_casb_customer
                                               ,is_swg_customer
                                               ,is_npa_customer
                                               ,is_firewall_customer
                                               ,number_of_products
                                               ,customer_has_two_or_more_products
                                               from arr_base)
          ,cte_dates as (select distinct activity_date from arr_base)
          ,cte_arr as (select account_id
                              ,account_name
                              ,activity_date
                              ,arr
                              ,casb_api_arr
                              ,casb_inline_arr
                              ,casb_combined_arr
                              ,swg_arr
                              ,ng_swg_packages_arr
                              ,swg_combined_arr
                              ,cspm_arr
                              ,firewall_arr
                              ,iaas_storage_scan_arr
                              ,npa_arr
                              ,email_arr
                              ,rbi_arr
                              ,dlp_arr
                              ,dlp_std_arr
                              ,dlp_adv_arr
                              ,support_arr

          from arr_base)

        -- cartesian product
        select
          dim.activity_date
          ,dateadd(year,-1,dim.activity_date) as prev_year_date
          ,add_months(dim.activity_date, -3) as prev_quar_date
          ,dim.account_id
          ,dim.account_name
          ,dim.account_owner_name
          ,dim.account_region_c
          ,dim.account_theater_c
          ,dim.owner_theater_c
          ,dim.customer_first_purchase_date
          ,dim.cohort_date
          ,f4.line_item_count
          ,f5.line_item_count as line_item_count_prev_year
          ,f6.line_item_count as line_item_count_prev_quarter
          ,f1.arr as arr_current_year
          ,f2.arr as arr_prev_year
          ,f3.arr as arr_prev_quarter
          ,f7.arr as arr_prev_year_plus15
          ,f8.arr as arr_prev_quarter_plus15
        from
        (
         select
          a.*
          ,d.activity_date
         from cte_accounts a cross join cte_dates d
        ) as dim
              left outer join cte_arr f1 on dim.account_id = f1.account_id and dim.activity_date = f1.activity_date
              left outer join cte_arr f2 on dim.account_id = f2.account_id and (dateadd(year,-1,dim.activity_date) = f2.activity_date)
              left outer join cte_arr f3 on dim.account_id = f3.account_id and (add_months(dim.activity_date, -3) = f3.activity_date)
              left outer join cte_account_product_info f4 on dim.account_id = f4.account_id and dim.activity_date = f4.activity_date
              left outer join cte_account_product_info f5 on dim.account_id = f5.account_id and (dateadd(year,-1,dim.activity_date) = f5.activity_date)
              left outer join cte_account_product_info f6 on dim.account_id = f6.account_id and (add_months(dim.activity_date, -3) = f6.activity_date)
              left outer join cte_arr f7 on dim.account_id = f7.account_id and (dateadd(day,15,(dateadd(year,-1,dim.activity_date))) = f7.activity_date)
              left outer join cte_arr f8 on dim.account_id = f8.account_id and (dateadd(day,15,(add_months(dim.activity_date, -3))) = f8.activity_date)
        order by
          dim.activity_date
          ,dim.account_id

Adding current results and desired results. Only including relevant columns in sample data. For account 2, arr_prev_year is NULL because there was no revenue received in Jan. 2020 for that account. arr_prev_year_plus15 is also NULL, as no revenue was received for the entire month of January.

In the desired results, prior to Jan. 31, 2020, account 2 most recently had revenue received on Dec. 31, 2019. So that date, and the corresponding revenue is returned in the prev_year_most_recent_date and arr_prev_year_most_recent columns.

Current Results

Activity_date | Prev_year_date | Prev_quar_date | prev_year_plus15_date | prev_quar_plus15_date | account_id | arr_current_year | arr_prev_year | arr_prev_quarter | arr_prev_year_plus15 | arr_prev_quarter_plus15
--- | --- | --- | --- | --- | --- | --- | --- | --- | --- | ---
Jan. 31, 2021 | Jan. 31, 2020 | Oct. 31, 2020 | Jan. 16, 2020 | Oct. 16, 2020 | 1 | 100 | 90 | 95 | 90 | 95
Jan. 31, 2021 | Jan. 31, 2020 | Oct. 31, 2020 | Jan. 16, 2020 | Oct. 16, 2020 | 2 | 100 | NULL | 80 | NULL | 80

Desired results:

Activity_date | Prev_year_date | Prev_quar_date | prev_year_most_recent_active_date | prev_quarter_most_recent_active_date | account_id | arr_current_year | arr_prev_year | arr_prev_quarter | arr_prev_year_most_recent | arr_prev_quarter_most_recent
--- | --- | --- | --- | --- | --- | --- | --- | --- | --- | ---
Jan. 31, 2021 | Jan. 31, 2020 | Oct. 31, 2020 | Jan. 30, 2020 | Oct. 30, 2020 | 1 | 100 | 90 | 95 | 90 | 95
Jan. 31, 2021 | Jan. 31, 2020 | Oct. 31, 2020 | Dec. 31, 2019 | Oct. 30, 2020 | 2 | 100 | NULL | 80 | 75 | 80

So I would firstly rewrite your current SQL as follows.

The main points are: select only the columns you want rather than using *, and don't apply functions inside join or WHERE conditions; compute the derived dates once, up front, and join on those.

with arr_base as (
    select 
        account_id
        ,account_name
        ,account_owner_name
        ,account_region_c
        ,account_theater_c
        ,owner_theater_c
        ,customer_first_purchase_date
        ,cohort_date
        
        ,activity_date
        ,line_item_count
        ,has_casb_count
        ,has_casb_api_count
        ,has_casb_inline_count
        ,has_swg_count
        ,has_ng_swg_count
        ,has_swg_all_count
        ,has_npa_count
        ,has_iaas_count
        ,has_dlp_count
        ,has_dlp_adv_count
        ,has_dlp_std_count
        ,has_firewall_count
        ,has_cspm_count
        ,has_email_count
        ,has_rbi_count
        ,has_support_count
        ,npa_user_count
        ,is_casb_customer
        ,is_swg_customer
        ,is_npa_customer
        ,is_firewall_customer
        ,number_of_products
        ,customer_has_two_or_more_products
        
        ,arr
        ,casb_api_arr
        ,casb_inline_arr
        ,casb_combined_arr
        ,swg_arr
        ,ng_swg_packages_arr
        ,swg_combined_arr
        ,cspm_arr
        ,firewall_arr
        ,iaas_storage_scan_arr
        ,npa_arr
        ,email_arr
        ,rbi_arr
        ,dlp_arr
        ,dlp_std_arr
        ,dlp_adv_arr
        ,support_arr
        
    from arr_base_table
), cte_accounts as (
    select distinct 
        account_id
        ,account_name
        ,account_owner_name
        ,account_region_c
        ,account_theater_c
        ,owner_theater_c
        ,customer_first_purchase_date
        ,cohort_date
    from arr_base
), cte_account_product_info as (
    select 
        account_id
        ,account_name
        ,activity_date
        ,line_item_count
        ,has_casb_count
        ,has_casb_api_count
        ,has_casb_inline_count
        ,has_swg_count
        ,has_ng_swg_count
        ,has_swg_all_count
        ,has_npa_count
        ,has_iaas_count
        ,has_dlp_count
        ,has_dlp_adv_count
        ,has_dlp_std_count
        ,has_firewall_count
        ,has_cspm_count
        ,has_email_count
        ,has_rbi_count
        ,has_support_count
        ,npa_user_count
        ,is_casb_customer
        ,is_swg_customer
        ,is_npa_customer
        ,is_firewall_customer
        ,number_of_products
        ,customer_has_two_or_more_products
    from arr_base
), cte_dates as (
    select distinct 
        activity_date
    from arr_base
), cte_arr as (
    select 
        account_id
        ,account_name
        ,activity_date
        ,arr
        ,casb_api_arr
        ,casb_inline_arr
        ,casb_combined_arr
        ,swg_arr
        ,ng_swg_packages_arr
        ,swg_combined_arr
        ,cspm_arr
        ,firewall_arr
        ,iaas_storage_scan_arr
        ,npa_arr
        ,email_arr
        ,rbi_arr
        ,dlp_arr
        ,dlp_std_arr
        ,dlp_adv_arr
        ,support_arr
    from arr_base
), dim_data AS (
    select
        a.account_id
        ,a.account_name
        ,a.account_owner_name
        ,a.account_region_c
        ,a.account_theater_c
        ,a.owner_theater_c
        ,a.customer_first_purchase_date
        ,a.cohort_date
        ,d.activity_date
        ,dateadd(year, -1, d.activity_date) as prev_year_date
        ,dateadd(month, -3, d.activity_date) as prev_quar_date
        ,dateadd(day, 15, prev_year_date) as prev_year_plus15d_date
        ,dateadd(day, 15, prev_quar_date) as prev_quar_plus15d_date
    from cte_accounts a 
    cross join cte_dates d
)
select
  dim.activity_date
  ,dim.prev_year_date
  ,dim.prev_quar_date
  ,dim.account_id
  ,dim.account_name
  ,dim.account_owner_name
  ,dim.account_region_c
  ,dim.account_theater_c
  ,dim.owner_theater_c
  ,dim.customer_first_purchase_date
  ,dim.cohort_date
  ,f4.line_item_count
  ,f5.line_item_count as line_item_count_prev_year
  ,f6.line_item_count as line_item_count_prev_quarter
  ,f1.arr as arr_current_year
  ,f2.arr as arr_prev_year
  ,f3.arr as arr_prev_quarter
  ,f7.arr as arr_prev_year_plus15
  ,f8.arr as arr_prev_quarter_plus15
from dim_data as dim
left outer join cte_arr f1 
    on dim.account_id = f1.account_id and dim.activity_date = f1.activity_date
left outer join cte_arr f2 
    on dim.account_id = f2.account_id and dim.prev_year_date = f2.activity_date
left outer join cte_arr f3 
    on dim.account_id = f3.account_id and dim.prev_quar_date = f3.activity_date
left outer join cte_account_product_info f4 
    on dim.account_id = f4.account_id and dim.activity_date = f4.activity_date
left outer join cte_account_product_info f5 
    on dim.account_id = f5.account_id and dim.prev_year_date = f5.activity_date
left outer join cte_account_product_info f6 
    on dim.account_id = f6.account_id and dim.prev_quar_date = f6.activity_date
left outer join cte_arr f7 
    on dim.account_id = f7.account_id and dim.prev_year_plus15d_date = f7.activity_date
left outer join cte_arr f8 
    on dim.account_id = f8.account_id and dim.prev_quar_plus15d_date = f8.activity_date
order by dim.activity_date, dim.account_id

After pulling all the columns into the select, the unused ones can be removed, giving:

with arr_base as (
    select 
        account_id
        ,account_name
        ,account_owner_name
        ,account_region_c
        ,account_theater_c
        ,owner_theater_c
        ,customer_first_purchase_date
        ,cohort_date
        
        ,activity_date
        ,line_item_count
        
        ,arr

    from arr_base_table
), cte_accounts as (
    select distinct 
        account_id
        ,account_name
        ,account_owner_name
        ,account_region_c
        ,account_theater_c
        ,owner_theater_c
        ,customer_first_purchase_date
        ,cohort_date
    from arr_base
), cte_account_product_info as (
    select 
        account_id
        ,activity_date
        ,line_item_count
    from arr_base
), cte_dates as (
    select distinct 
        activity_date
    from arr_base
), cte_arr as (
    select 
        account_id
        ,activity_date
        ,arr
    from arr_base
), dim_data AS (
    select
        a.account_id
        ,a.account_name
        ,a.account_owner_name
        ,a.account_region_c
        ,a.account_theater_c
        ,a.owner_theater_c
        ,a.customer_first_purchase_date
        ,a.cohort_date
        ,d.activity_date
        ,dateadd(year, -1, d.activity_date) as prev_year_date
        ,dateadd(month, -3, d.activity_date) as prev_quar_date
        ,dateadd(day, 15, prev_year_date) as prev_year_plus15d_date
        ,dateadd(day, 15, prev_quar_date) as prev_quar_plus15d_date
    from cte_accounts a 
    cross join cte_dates d
)
select
  dim.activity_date
  ,dim.prev_year_date
  ,dim.prev_quar_date
  ,dim.account_id
  ,dim.account_name
  ,dim.account_owner_name
  ,dim.account_region_c
  ,dim.account_theater_c
  ,dim.owner_theater_c
  ,dim.customer_first_purchase_date
  ,dim.cohort_date
  ,f4.line_item_count
  ,f5.line_item_count as line_item_count_prev_year
  ,f6.line_item_count as line_item_count_prev_quarter
  ,f1.arr as arr_current_year
  ,f2.arr as arr_prev_year
  ,f3.arr as arr_prev_quarter
  ,f7.arr as arr_prev_year_plus15
  ,f8.arr as arr_prev_quarter_plus15
from dim_data as dim
left outer join cte_arr f1 
    on dim.account_id = f1.account_id and dim.activity_date = f1.activity_date
left outer join cte_arr f2 
    on dim.account_id = f2.account_id and dim.prev_year_date = f2.activity_date
left outer join cte_arr f3 
    on dim.account_id = f3.account_id and dim.prev_quar_date = f3.activity_date
left outer join cte_account_product_info f4 
    on dim.account_id = f4.account_id and dim.activity_date = f4.activity_date
left outer join cte_account_product_info f5 
    on dim.account_id = f5.account_id and dim.prev_year_date = f5.activity_date
left outer join cte_account_product_info f6 
    on dim.account_id = f6.account_id and dim.prev_quar_date = f6.activity_date
left outer join cte_arr f7 
    on dim.account_id = f7.account_id and dim.prev_year_plus15d_date = f7.activity_date
left outer join cte_arr f8 
    on dim.account_id = f8.account_id and dim.prev_quar_plus15d_date = f8.activity_date
order by dim.activity_date, dim.account_id

But really what you are after seems to be:

with arr_base as (
    select 
        account_id
        ,account_name
        ,account_owner_name
        ,account_region_c
        ,account_theater_c
        ,owner_theater_c
        ,customer_first_purchase_date
        ,cohort_date
        
        ,activity_date
        ,line_item_count
        
        ,arr

    from arr_base_table
), cte_accounts as (
    select distinct 
        account_id
        ,account_name
        ,account_owner_name
        ,account_region_c
        ,account_theater_c
        ,owner_theater_c
        ,customer_first_purchase_date
        ,cohort_date
    from arr_base
), cte_account_product_info as (
    select 
        account_id
        ,activity_date
        ,line_item_count
    from arr_base
), cte_dates as (
    select distinct 
        activity_date
    from arr_base
), cte_arr as (
    select 
        account_id
        ,activity_date
        ,arr
    from arr_base
), cte_make_sure_only_one_arr_per_day AS (
    -- collapse to one row per account and day, and precompute the lookup dates
    select 
        account_id
        ,activity_date
        ,dateadd(year, -1, activity_date) as prior_year_date
        ,dateadd(month, -3, activity_date) as prior_quarter_date 
        ,max(arr) as arr
    from cte_arr
    group by 1,2
), cte_prior_year_arrs AS (
    -- for each account/day, keep the latest row strictly before the prior-year date
    SELECT 
        a.account_id
        ,a.activity_date
        ,b.activity_date as prior_year_activity_date
        ,b.arr as prior_year_arr
    FROM cte_make_sure_only_one_arr_per_day AS a
    JOIN cte_make_sure_only_one_arr_per_day AS b 
        ON a.account_id = b.account_id AND b.activity_date < a.prior_year_date
    QUALIFY ROW_NUMBER() OVER (PARTITION BY a.account_id, a.activity_date ORDER BY b.activity_date DESC) = 1
), cte_prior_quarter_arrs AS (
    -- same idea, relative to the prior-quarter date
    SELECT 
        a.account_id
        ,a.activity_date
        ,b.activity_date as prior_quarter_activity_date
        ,b.arr as prior_quarter_arr
    FROM cte_make_sure_only_one_arr_per_day AS a
    JOIN cte_make_sure_only_one_arr_per_day AS b 
        ON a.account_id = b.account_id AND b.activity_date < a.prior_quarter_date
    QUALIFY ROW_NUMBER() OVER (PARTITION BY a.account_id, a.activity_date ORDER BY b.activity_date DESC) = 1
), dim_data AS (
    select
        a.account_id
        ,a.account_name
        ,a.account_owner_name
        ,a.account_region_c
        ,a.account_theater_c
        ,a.owner_theater_c
        ,a.customer_first_purchase_date
        ,a.cohort_date
        ,d.activity_date
        ,dateadd(year, -1, d.activity_date) as prev_year_date
        ,dateadd(month, -3, d.activity_date) as prev_quar_date
    from cte_accounts a 
    cross join cte_dates d
)
select
  dim.activity_date
  ,dim.prev_year_date
  ,dim.prev_quar_date
  ,f7.prior_year_activity_date as prev_year_most_recent_active_date
  ,f8.prior_quarter_activity_date as prev_quarter_most_recent_active_date
  ,dim.account_id
  ,dim.account_name
  ,dim.account_owner_name
  ,dim.account_region_c
  ,dim.account_theater_c
  ,dim.owner_theater_c
  ,dim.customer_first_purchase_date
  ,dim.cohort_date
  ,f4.line_item_count
  ,f5.line_item_count as line_item_count_prev_year
  ,f6.line_item_count as line_item_count_prev_quarter
  ,f1.arr as arr_current_year
  ,f2.arr as arr_prev_year
  ,f3.arr as arr_prev_quarter
  ,f7.prior_year_arr as arr_prev_year_most_recent
  ,f8.prior_quarter_arr as arr_prev_quarter_most_recent
from dim_data as dim
left outer join cte_arr f1 
    on dim.account_id = f1.account_id and dim.activity_date = f1.activity_date
left outer join cte_arr f2 
    on dim.account_id = f2.account_id and dim.prev_year_date = f2.activity_date
left outer join cte_arr f3 
    on dim.account_id = f3.account_id and dim.prev_quar_date = f3.activity_date
left outer join cte_account_product_info f4 
    on dim.account_id = f4.account_id and dim.activity_date = f4.activity_date
left outer join cte_account_product_info f5 
    on dim.account_id = f5.account_id and dim.prev_year_date = f5.activity_date
left outer join cte_account_product_info f6 
    on dim.account_id = f6.account_id and dim.prev_quar_date = f6.activity_date
left outer join cte_prior_year_arrs f7 
    on dim.account_id = f7.account_id and dim.activity_date = f7.activity_date
left outer join cte_prior_quarter_arrs f8 
    on dim.account_id = f8.account_id and dim.activity_date = f8.activity_date
order by dim.activity_date, dim.account_id

This will give a "prior" day for each current day that has activity. But if you want the prior day relative to the -1 year or -1 quarter date for a current day without data, then cte_make_sure_only_one_arr_per_day will need to be replaced with dim_data as the left-hand (a) side of those joins.

But I think this shows the way to get the data you want.
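For the variant mentioned above, where the lookup is driven from dim_data so that days without activity still get a "most recent" value, a minimal sketch might look like the following. It covers only the prior-year side. The inline sample rows are illustrative stand-ins for arr_base_table, the output column names are borrowed from the desired results in the question, and the `b.arr is not null` filter is an assumption (it only matters if a lapse is stored as a row with a NULL arr rather than as a missing row).

```sql
-- Sketch only: a tiny inline sample standing in for arr_base_table (illustrative values).
with cte_arr as (
    select
        account_id
        ,to_date(activity_date) as activity_date
        ,arr
    from (values
        (1, '2020-01-30', 90)
        ,(1, '2020-01-31', 90)
        ,(1, '2021-01-31', 100)
        ,(2, '2019-12-31', 75)
        ,(2, '2021-01-31', 100)
    ) as t (account_id, activity_date, arr)
), dim_data as (
    -- every account paired with every activity date, as in the answer above
    select
        a.account_id
        ,d.activity_date
        ,dateadd(year, -1, d.activity_date) as prev_year_date
    from (select distinct account_id from cte_arr) a
    cross join (select distinct activity_date from cte_arr) d
)
select
    dim.account_id
    ,dim.activity_date
    ,dim.prev_year_date
    ,b.activity_date as prev_year_most_recent_active_date
    ,b.arr as arr_prev_year_most_recent
from dim_data as dim
left join cte_arr as b
    on dim.account_id = b.account_id
    and b.activity_date < dim.prev_year_date
    and b.arr is not null  -- assumption: a lapse may be stored as a NULL arr row
-- keep only the latest qualifying row per account/day; unmatched days keep a single NULL row
qualify row_number() over (
    partition by dim.account_id, dim.activity_date
    order by b.activity_date desc
) = 1
order by dim.activity_date, dim.account_id;
```

With this sample, the Jan. 31, 2021 row for account 2 should pick up Dec. 31, 2019 and 75, in line with the desired results in the question. The prior-quarter lookup would follow the same pattern with prev_quar_date. Note that a range join like this can get expensive on a large account-by-date grid, so it may be worth restricting dim_data to the dates you actually report on.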
