
How to group by week and distinct by day in PostgreSQL

Sample contents are:

id  created_dt           data
1   2023-01-14 11:52:41  {"customers": 1, "payments": 2}
2   2023-01-15 11:53:43  {"customers": 1, "payments": 2}
3   2023-01-18 11:51:45  {"customers": 1, "payments": 2}
4   2023-01-15 11:50:48  {"customers": 1, "payments": 2}

IDs 2 and 4 fall on the same day, so only one of them should be counted.
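To try the queries in this thread against the sample rows, a setup along these lines should work. The column types are an assumption, since the question does not state them; jsonb is chosen here because the question's queries cast the result of data->'customers' directly to int.

```sql
-- Assumed schema: the question does not give the column types.
CREATE TABLE analytics (
    id         integer   PRIMARY KEY,
    created_dt timestamp NOT NULL,
    data       jsonb     NOT NULL
);

-- The four sample rows from the question.
INSERT INTO analytics (id, created_dt, data) VALUES
    (1, '2023-01-14 11:52:41', '{"customers": 1, "payments": 2}'),
    (2, '2023-01-15 11:53:43', '{"customers": 1, "payments": 2}'),
    (3, '2023-01-18 11:51:45', '{"customers": 1, "payments": 2}'),
    (4, '2023-01-15 11:50:48', '{"customers": 1, "payments": 2}');
```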

I want to get a result as follows:

year  week  customers  payments
2023  2     2          4
2023  3     1          2

I solved the problem this way:

SELECT
    date_part('year', sq.created_dt) AS year,
    date_part('week', sq.created_dt) AS week,
    sum((sq.data->'customers')::int) AS customers,
    sum((sq.data->'payments')::int) AS payments
FROM 
    (SELECT DISTINCT ON (created_dt::date) created_dt, data 
     FROM analytics) sq
GROUP BY 
    year, week
ORDER BY 
    year, week;

However, that subquery greatly complicates the query. Is there a better method?

I need to group the data by week, but I also need to remove duplicate days.

Using generate_series to build a table of week start dates to join against would solve the problem:

-- Adjust the series bounds so that they cover the dates stored in analytics.
SELECT sum((a.data->'customers')::int) AS customers,
       sum((a.data->'payments')::int) AS payments,
       date_part('year', dategroup) AS year,
       date_part('week', dategroup) AS week
FROM generate_series(current_date, current_date + interval '1 month', interval '1 week') AS dategroup
JOIN analytics AS a
  ON a.created_dt >= dategroup
 AND a.created_dt < dategroup + interval '1 week'  -- upper bound is the start of the next week
GROUP BY dategroup
ORDER BY dategroup;
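As a rough sketch of how the generate_series idea could be combined with the one-row-per-day requirement: the query below takes the week boundaries from the data itself instead of current_date and joins them to a deduplicated set of days. The CTE names (daily, weeks), the choice to keep the latest row of each day, and the use of ->> with 'isoyear' are assumptions made for illustration, not part of the answer above.

```sql
WITH daily AS (
    -- One row per calendar day; the ORDER BY decides which row survives
    -- (assumption: the latest row of the day).
    SELECT DISTINCT ON (created_dt::date)
           created_dt, data
    FROM analytics
    ORDER BY created_dt::date, created_dt DESC
),
weeks AS (
    -- One row per week start (Monday) between the earliest and latest day.
    SELECT gs AS week_start
    FROM (SELECT min(created_dt) AS lo, max(created_dt) AS hi FROM daily) AS b,
         generate_series(date_trunc('week', b.lo),
                         date_trunc('week', b.hi),
                         interval '1 week') AS gs
)
SELECT date_part('isoyear', w.week_start) AS year,  -- ISO year pairs with the ISO week number
       date_part('week', w.week_start)    AS week,
       coalesce(sum((d.data->>'customers')::int), 0) AS customers,
       coalesce(sum((d.data->>'payments')::int), 0)  AS payments
FROM weeks AS w
LEFT JOIN daily AS d
       ON d.created_dt >= w.week_start
      AND d.created_dt <  w.week_start + interval '1 week'
GROUP BY w.week_start
ORDER BY w.week_start;
```

The LEFT JOIN keeps weeks that have no rows at all, reported with zero totals; switching to an inner JOIN would drop them.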

First of all, I think your query is quite simple and understandable.

Here is the same query written with a WITH query (CTE), which makes it somewhat more readable:

WITH unique_days_data AS (
  SELECT DISTINCT created_dt::date, data_json
  FROM analytics)
SELECT 
    date_part('year', ud.created_dt) as year,
    date_part('week', ud.created_dt) as week,
    sum((ud.data_json->'customers')::int) as customers,
    sum((ud.data_json->'payments')::int) as payments
FROM unique_days_data ud
GROUP BY year, week
ORDER BY year, week;

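The difference is that this query uses the DISTINCT clause rather than the DISTINCT ON clause. DISTINCT removes a row only when both the date and the JSON value are identical to another row's, whereas DISTINCT ON (created_dt::date) keeps one row per day regardless of the data.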

Here is the SQL fiddle.
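If rows on the same day can carry different data, a CTE built on DISTINCT ON may be closer to what the question asks for. The following is only a sketch: the column is called data as in the question, the alias day_date is made up for illustration, and keeping the latest row of each day is an assumption, since the question does not say which duplicate should win.

```sql
WITH unique_days_data AS (
    -- One row per calendar day. Without an ORDER BY, DISTINCT ON keeps an
    -- unpredictable row; here the latest row of each day is kept (assumption).
    SELECT DISTINCT ON (created_dt::date)
           created_dt::date AS day_date,
           data
    FROM analytics
    ORDER BY created_dt::date, created_dt DESC
)
SELECT date_part('isoyear', day_date) AS year,
       date_part('week', day_date)    AS week,
       sum((data->>'customers')::int) AS customers,
       sum((data->>'payments')::int)  AS payments
FROM unique_days_data
GROUP BY year, week
ORDER BY year, week;
```

date_part('week', ...) returns the ISO week number, so it is paired with 'isoyear' here; with 'year', the first days of January can be reported under the new year with week 52 or 53. The ->> operator returns text, which lets the cast to int work for both json and jsonb columns.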

You can simplify it by also grouping on "created_dt::date", then keeping only the last aggregated record for each week by ordering on ROW_NUMBER() and using FETCH FIRST n ROWS WITH TIES (available in PostgreSQL 13 and later).

SELECT date_part('year', created_dt) AS year,
       date_part('week', created_dt) AS week,
       SUM((data->>'customers')::int) AS customers,
       SUM((data->>'payments')::int) AS payments
FROM analytics
GROUP BY year, week, created_dt::date
ORDER BY ROW_NUMBER() OVER(
             PARTITION BY date_part('week', created_dt) 
             ORDER     BY created_dt::date DESC
         )
FETCH FIRST 1 ROWS WITH TIES

Check the demo here.
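To see which rows that ORDER BY plus WITH TIES combination keeps, it can help to expose the row number as an ordinary column. This is a diagnostic sketch only; the aliases day_date and rn are made up for illustration.

```sql
SELECT created_dt::date                AS day_date,
       date_part('week', created_dt)   AS week,
       SUM((data->>'customers')::int)  AS customers,
       SUM((data->>'payments')::int)   AS payments,
       ROW_NUMBER() OVER (
           PARTITION BY date_part('week', created_dt)
           ORDER BY created_dt::date DESC
       )                               AS rn   -- 1 = latest day within its week
FROM analytics
GROUP BY created_dt::date, date_part('week', created_dt)
ORDER BY week, rn;
```

Every row with rn = 1 ties for first place, so FETCH FIRST 1 ROWS WITH TIES returns exactly those rows: the per-day totals of the latest day in each week. Rows sharing a day are summed together rather than deduplicated, so it is worth checking this against data where a week's days carry different values before relying on it.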
