
PostgreSQL Crosstab generate_series of weeks for columns

From a table of "time entries" I'm trying to create a report of weekly totals for each user.

Sample of the table:

+-----+---------+-------------------------+--------------+
| id  | user_id | start_time              | hours_worked |
+-----+---------+-------------------------+--------------+
| 997 | 6       | 2018-01-01 03:05:00 UTC | 1.0          |
| 996 | 6       | 2017-12-01 05:05:00 UTC | 1.0          |
| 998 | 6       | 2017-12-01 05:05:00 UTC | 1.5          |
| 999 | 20      | 2017-11-15 19:00:00 UTC | 1.0          |
| 995 | 6       | 2017-11-11 20:47:42 UTC | 0.04         |
+-----+---------+-------------------------+--------------+

Right now I can run the following and basically get what I need:

SELECT COALESCE(SUM(time_entries.hours_worked),0) AS total, 
  time_entries.user_id, 
  week::date

--Using generate_series here to account for weeks with no time entries when
--doing the join

FROM generate_series( (DATE_TRUNC('week', '2017-11-01 00:00:00'::date)),
                      (DATE_TRUNC('week', '2017-12-31 23:59:59.999999'::date)),
                      interval '7 day') as week LEFT JOIN time_entries
ON DATE_TRUNC('week', time_entries.start_time) = week

GROUP BY week, time_entries.user_id
ORDER BY week

This will return:

+-------+---------+------------+
| total | user_id | week       |
+-------+---------+------------+
| 14.08 | 5       | 2017-10-30 |
| 21.92 | 6       | 2017-10-30 |
| 10.92 | 7       | 2017-10-30 |
| 14.26 | 8       | 2017-10-30 |
| 14.78 | 10      | 2017-10-30 |
| 14.08 | 13      | 2017-10-30 |
| 15.83 | 15      | 2017-10-30 |
| 8.75  | 5       | 2017-11-06 |
| 10.53 | 6       | 2017-11-06 |
| 13.73 | 7       | 2017-11-06 |
| 14.26 | 8       | 2017-11-06 |
| 19.45 | 10      | 2017-11-06 |
| 15.95 | 13      | 2017-11-06 |
| 14.16 | 15      | 2017-11-06 |
| 1.00  | 20      | 2017-11-13 |
| 0     |         | 2017-11-20 |
| 2.50  | 6       | 2017-11-27 |
| 0     |         | 2017-12-04 |
| 0     |         | 2017-12-11 |
| 0     |         | 2017-12-18 |
| 0     |         | 2017-12-25 |
+-------+---------+------------+

However, this is difficult to parse, particularly when there's no data for a week. What I would like is a pivot or crosstab table where the weeks are the columns and the rows are the users. It should also include the empty cells (for instance, a user with no entries in a given week, or a week with no entries from any user).

Something like this:

+---------+---------------+--------------+--------------+
| user_id | 2017-10-30    | 2017-11-06   | 2017-11-13   |
+---------+---------------+--------------+--------------+
| 6       | 4.0           | 1.0          | 0            |
| 7       | 4.0           | 1.0          | 0            |
| 8       | 4.0           | 0            | 0            |
| 9       | 0             | 1.0          | 0            |
| 10      | 4.0           | 0.04         | 0            |
+---------+---------------+--------------+--------------+

I've been looking around online and it seems that "dynamically" generating a list of columns for crosstab is difficult. I'd rather not hard-code them, which seems weird to do for dates anyway, or use something like a CASE expression on the week number.

Should I look for another solution besides crosstab? If I could get the series of weeks for each user, including all nulls, I think that would be good enough. It just seems that right now my join strategy isn't returning that.

Personally I would use a Date Dimension table and use that table as the basis for the query. I find it far easier to use tabular data for these types of calculations, as it leads to SQL that's easier to read and maintain. There's a great article on creating a Date Dimension table in PostgreSQL at https://medium.com/@duffn/creating-a-date-dimension-table-in-postgresql-af3f8e2941ac, though you could get away with a much simpler version of this table.
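As a rough illustration of what a "much simpler version" might look like, here is a minimal sketch of a week-level date table. The table name dim_week, the column name week_start, and the date range are all assumptions made for this example; they do not come from the linked article.

```sql
-- Hypothetical minimal week dimension: one row per week.
-- Table and column names are illustrative only.
CREATE TABLE dim_week (
    week_start date PRIMARY KEY
);

-- Populate it with one row per week for an assumed 2017-2018 range.
-- date_trunc('week', ...) snaps to the Monday that starts the ISO week,
-- so the first row may fall slightly before the nominal start date.
INSERT INTO dim_week (week_start)
SELECT gs::date
FROM generate_series(
         date_trunc('week', timestamp '2017-01-01'),
         date_trunc('week', timestamp '2018-12-31'),
         interval '7 days'
     ) AS gs;
```

Because the weeks are generated with the same date_trunc('week', ...) call that the report uses, the stored week_start values should line up with the truncated start_time values when joining.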

Ultimately what you would do is use the Date table as the base of the query (the SELECT cols FROM table part) and then join against it, or probably use Common Table Expressions, to create the calculations.
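One way this could look is sketched below. It is an assumption-laden example, not a tested solution: the weeks CTE stands in for the Date table (a real date table could replace it), the users CTE derives the user list from time_entries because no users table is shown in the question, and the date range is copied from the question's query.

```sql
WITH weeks AS (
    -- Stand-in for the Date table: one row per week in the report range.
    SELECT gs::date AS week_start
    FROM generate_series(
             date_trunc('week', timestamp '2017-11-01'),
             date_trunc('week', timestamp '2017-12-31'),
             interval '7 days'
         ) AS gs
),
users AS (
    -- Assumption: every user of interest has at least one time entry.
    SELECT DISTINCT user_id
    FROM time_entries
),
weekly_totals AS (
    -- Hours per user per week, only for weeks that have entries.
    SELECT user_id,
           date_trunc('week', start_time)::date AS week_start,
           sum(hours_worked) AS total
    FROM time_entries
    GROUP BY user_id, date_trunc('week', start_time)
)
SELECT u.user_id,
       w.week_start,
       COALESCE(t.total, 0) AS total
FROM weeks AS w
CROSS JOIN users AS u              -- every (week, user) pair, even empty ones
LEFT JOIN weekly_totals AS t
       ON t.user_id = u.user_id
      AND t.week_start = w.week_start
ORDER BY u.user_id, w.week_start;
```

The CROSS JOIN is what the original join strategy lacks: it produces a row for every user in every week before the totals are attached, so gaps show up as 0 instead of disappearing or collapsing into a single row with an empty user_id. If start_time is a timestamp with time zone, the week boundaries depend on the session time zone, which is worth checking before relying on the totals.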

If you would like, I'll write up a solution demonstrating how you could create such a query.
