
Sum column value and group dates by month with Postgres

I've got a table in my Postgres DB that looks something like this:

date          duration
2018-05-10      10
2018-05-12      15
2018-06-01      10
2018-06-02      20
2019-01-01      5
2019-01-02      15
2019-04-01      10

I wish to sum the values for each month and group them by year, month, and month number into something like this:

year    month    month_number   monthly_sum
2018    May         5              25
2018    June        6              30
2019    Jan         1              20
2019    Apr         4              10

I ended up with a query that looks like this:

SELECT 
  to_char(date_trunc('month', date), 'YYYY') AS year,
  to_char(date_trunc('month', date), 'Mon') AS month,
  to_char(date_trunc('month', date), 'MM') AS month_number,
  sum(duration) AS monthly_sum
FROM timesheet 
GROUP BY year, month, month_number

It works just fine. My question is: is this query considered bad? Will it affect performance if I have, say, 100k rows? I heard that using to_char is inferior to date_trunc, which is what I tried to avoid here; I just wrapped the date_trunc in a to_char. Also, does having three values in the GROUP BY clause affect anything?

Applying functions to a column and then grouping by the results may degrade performance. It is preferable to have a Calendar table with proper indexes for this purpose, so that you won't need to deal with such issues on every table.

Check this and this (Calendar Table).
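As a rough sketch of the Calendar table idea, assuming one row per month is enough for this report: the table name `calendar_month`, its columns, and the 2018 to 2019 range are all made up for illustration and are not from the question.

```sql
-- Hypothetical calendar table with one row per month (all names are assumptions).
CREATE TABLE calendar_month (
    month_start  date PRIMARY KEY,   -- first day of the month
    year         integer NOT NULL,
    month_number integer NOT NULL,
    month_name   text    NOT NULL
);

-- Fill it once for the range you care about (here 2018-01 through 2019-12).
INSERT INTO calendar_month (month_start, year, month_number, month_name)
SELECT m::date,
       extract(year  FROM m)::integer,
       extract(month FROM m)::integer,
       to_char(m, 'Mon')
FROM generate_series(timestamp '2018-01-01',
                     timestamp '2019-12-01',
                     interval '1 month') AS m;

-- Join timesheet rows to the month they fall in, then aggregate.
-- Grouping by the primary key lets the other calendar columns be selected directly.
SELECT c.year,
       c.month_name AS month,
       c.month_number,
       sum(t.duration) AS monthly_sum
FROM calendar_month AS c
JOIN timesheet AS t
  ON t.date >= c.month_start
 AND t.date <  c.month_start + interval '1 month'
GROUP BY c.month_start
ORDER BY c.month_start;
```

Whether this beats the plain date_trunc query on a table of this size is something to measure rather than assume; the main benefit is that the year and month labels are defined in one place.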

The query is not bad, but you can simplify it.

SELECT to_char(date_trunc('month', date), 'YYYY') AS year,
       to_char(date_trunc('month', date), 'Mon') AS month,
       to_char(date_trunc('month', date), 'MM') AS month_number,
       sum(duration) AS monthly_sum
FROM timesheet 
GROUP BY date_trunc('month', date);

Shorter GROUP BY keys would have a small impact on performance, but that is not something I would worry about.

Since your query does not have any filtering condition, it always reads all the rows of the table: this is the major impact on performance. If you had filtering conditions, you could be better off with the right indexes.
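To illustrate the point about filtering, here is a minimal sketch that assumes you only need one year of data. The index name `timesheet_date_idx` and the 2019 date range are assumptions for the example, not something from the question.

```sql
-- Assumed index name; a plain B-tree index on the date column.
CREATE INDEX timesheet_date_idx ON timesheet (date);

-- The range predicate is written against the bare column,
-- so the planner is able to consider the index above.
SELECT date_trunc('month', date) AS month_start,
       sum(duration) AS monthly_sum
FROM timesheet
WHERE date >= DATE '2019-01-01'
  AND date <  DATE '2020-01-01'
GROUP BY month_start
ORDER BY month_start;
```

Whether the index is actually used depends on how selective the range is; if the filter still covers most of the table, a sequential scan can remain the cheaper plan.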

Having said that, the way you are extracting years and months could be marginally improved, as other answers here show, but that will have little impact on the performance of the query.
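One possible variant, offered as a sketch and not necessarily what the other answers proposed: truncate and aggregate once in a subquery, then derive the display columns from the single `month_start` value. The aliases `month_start` and `m` are illustrative.

```sql
SELECT extract(year  FROM month_start)::integer AS year,
       to_char(month_start, 'Mon')              AS month,
       extract(month FROM month_start)::integer AS month_number,
       monthly_sum
FROM (
    -- One row per month: date_trunc is evaluated once per input row here,
    -- and the formatting above runs only once per group.
    SELECT date_trunc('month', date) AS month_start,
           sum(duration)             AS monthly_sum
    FROM timesheet
    GROUP BY 1
) AS m
ORDER BY month_start;
```

A side effect is that `year` and `month_number` come back as integers (5 instead of the text '05' that `to_char(..., 'MM')` produces), and the ORDER BY makes the row order chronological.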

In sum, in the absence of filtering conditions your query is close to optimal.
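If you want to check this against your own data instead of taking it on faith, you can ask Postgres for the actual plan and timings. A minimal sketch using the simplified query:

```sql
-- ANALYZE actually runs the query and reports real timings;
-- BUFFERS adds how many pages were read.
EXPLAIN (ANALYZE, BUFFERS)
SELECT to_char(date_trunc('month', date), 'YYYY') AS year,
       to_char(date_trunc('month', date), 'Mon')  AS month,
       to_char(date_trunc('month', date), 'MM')   AS month_number,
       sum(duration) AS monthly_sum
FROM timesheet
GROUP BY date_trunc('month', date);
```

With no WHERE clause you would typically expect a sequential scan over timesheet feeding an aggregate node, and the scan is usually where most of the time goes.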


 