

Turning Records by Date Range into Records by Day/Month Using SQL Server or Vertica

I can use either SQL Server or Vertica as the DB and Tableau as the reporting tool. A solution in any of these would be helpful.

DATA RESOURCES: I have a table (userActivity) with 100 records and a structure of: User, StartDate, EndDate

NEED: I am interested in preparing reports by day and month that show "total active days", meaning that if User1 has a range of '20180101' to '20180331', they will contribute one day for each day in Jan, Feb and Mar, or 31, 28 and 31 days if aggregated by month.

GOAL: I will ultimately be aggregating the total active days of all users as the output, to get a single total for each day/month.

This report will run in perpetuity, so I would prefer solutions that don't hard-code CASE/IF-THEN statements by day/month.

Thanks!

While recursive CTEs are a good candidate for this scenario, it can also be handled with Tableau alone. Assuming data with the User, StartDate, EndDate structure from the question, here are the steps required to produce the view.


  1. Create a reference sheet that lists every expected day. Even if you need to cover 25 years, from 01/01/2018 to 01/01/2043, that is still fewer than 10k rows.

You need two columns holding exactly the same date, because Tableau does not allow multiple join conditions on the same column.

  2. Create an inner join between the reference calendar and the data, with one join condition on each of the two calendar date columns.

  3. Build the view.

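For readers who would rather do the same calendar join in the database, the steps above can be sketched in SQL Server roughly as follows. This is only a sketch under assumptions: `calendarRef` and `CalDate` are hypothetical names, the date bounds are the ones mentioned in step 1, and the join conditions (calendar date on or after `StartDate` and on or before `EndDate`) are inferred from the description, since the original screenshot of the criteria is not available. In SQL a single calendar column can carry both conditions, so the duplicate column needed in Tableau is left out.

```sql
-- Step 1: reference calendar, one row per day (calendarRef is a hypothetical name).
WITH cal AS (
  SELECT CAST('20180101' AS date) AS CalDate
  UNION ALL
  SELECT DATEADD(dd, 1, CalDate)
  FROM   cal
  WHERE  CalDate < '20430101'
)
SELECT CalDate
INTO   calendarRef            -- fails if calendarRef already exists
FROM   cal
OPTION (MAXRECURSION 0);      -- 0 lifts the default limit of 100 recursion levels

-- Steps 2 and 3: inner join on the date range, then count active users per day.
-- Assumed join criteria: StartDate <= CalDate <= EndDate.
SELECT   c.CalDate,
         COUNT(*) AS activeUsers
FROM     calendarRef  AS c
JOIN     userActivity AS a
  ON     c.CalDate >= a.StartDate
 AND     c.CalDate <= a.EndDate
GROUP BY c.CalDate
ORDER BY c.CalDate;
```

Days on which no user is active do not appear in the result because of the inner join; a LEFT JOIN from the calendar with COUNT(a.StartDate) would keep them with a count of 0.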

Use Vertica: it has the TIMESERIES clause, so no recursion is needed.

I would try the query below, and check the intermediate results of the common table expressions to see how it works.

WITH 
-- two test rows ....
input(uid,start_dt,end_dt) AS (
            SELECT 1,DATE '2018-01-01', DATE '2018-03-31'
  UNION ALL SELECT 2,DATE '2018-02-01', DATE '2018-04-01'
)
,
-- set the stage for Vertica's TIMESERIES clause
-- note: TIMESERIES relies on timestamps ...
limits(uid,lim_dt,qty) AS (
  SELECT
    uid
  , start_dt::TIMESTAMP
  , 1
  FROM input
  UNION ALL
  SELECT
    uid
  , end_dt::TIMESTAMP
  , 1
  FROM input
)
,
-- apply the Vertica TIMESERIES clause
counters AS (
  SELECT
    uid
  , act_dt
  , TS_FIRST_VALUE(qty) AS qty
  FROM limits
  TIMESERIES act_dt AS '1 DAY' OVER(PARTITION BY uid ORDER BY lim_dt)
)
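-- note: MONTH() alone merges the same month of different years;
-- for multi-year data, also group by YEAR(act_dt)
SELECT
  uid
, MONTH(act_dt) AS activity_month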
, SUM(qty)
FROM counters
GROUP BY 1,2;
-- out  uid | activity_month | sum 
-- out -----+----------------+-----
-- out    1 |              1 |  31
-- out    1 |              2 |  28
-- out    1 |              3 |  31
-- out    2 |              2 |  28
-- out    2 |              3 |  31
-- out    2 |              4 |   1
-- out (6 rows)
-- out 
-- out time: first fetch (6 rows): 120.515 ms. all rows formatted: 120.627 ms

Solution (SQL Server, recursive CTE):

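WITH base AS (
  SELECT
     [User]     AS u   -- USER is a reserved word in T-SQL, so bracket it
    ,StartDate  AS s
    ,EndDate    AS e
    ,DATEDIFF(
      dd,
      StartDate,
      EndDate
      )+1       AS d   -- days in the range, both ends included
  FROM  userActivity
  ),
recurse AS (
  -- anchor: offset of the last day in each range
  SELECT    u, s, e, d, x=(d-1)
    FROM    base
    UNION ALL
    -- recursive step: count the offset down to 0 (the start date)
    SELECT  u, s, e, d, x-1 AS x
    FROM    recurse
    WHERE   x>0
  )
SELECT      u, DATEADD(dd, x, s) AS recordperday
FROM        recurse
ORDER BY    u, recordperday
-- Raises SQL Server's default recursion limit of 100 to 500 levels;
-- longer ranges need a higher value (maximum 32767) or 0 for no limit
OPTION (MAXRECURSION 500)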
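The query above returns one row per user per day, while the question's stated goal is a single total per month across all users. A minimal sketch of that roll-up follows, assuming `StartDate` and `EndDate` can be cast to `date`; the names `expanded`, `activeDay`, `activityYear`, `activityMonth` and `totalActiveDays` are made up for illustration. It walks forward from the start date instead of counting an offset down, and groups by year as well as month so that the same month in different years stays separate.

```sql
WITH expanded AS (
  -- anchor: first day of each range (rows with StartDate > EndDate are skipped)
  SELECT [User]                  AS u,
         CAST(StartDate AS date) AS activeDay,
         CAST(EndDate   AS date) AS e
  FROM   userActivity
  WHERE  StartDate <= EndDate
  UNION ALL
  -- recursive step: advance one day until EndDate is reached
  SELECT u, DATEADD(dd, 1, activeDay), e
  FROM   expanded
  WHERE  activeDay < e
)
SELECT   YEAR(activeDay)  AS activityYear,
         MONTH(activeDay) AS activityMonth,
         COUNT(*)         AS totalActiveDays   -- one row per user per active day
FROM     expanded
GROUP BY YEAR(activeDay), MONTH(activeDay)
ORDER BY activityYear, activityMonth
OPTION (MAXRECURSION 0);   -- no recursion limit, so long ranges do not fail
```

For a per-day total, grouping by `activeDay` alone in the final SELECT would be enough.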
