How to calculate the median in Postgres?

I have created a basic database (picture attached), and I am trying to find the following:

"Median total amount spent per user in each calendar month"

I tried the following, but I am getting errors:

SELECT 
user_id,
AVG(total_per_user)
FROM (SELECT user_id,
        ROW_NUMBER() over (ORDER BY total_per_user DESC) AS desc_total,
        ROW_NUMBER() over (ORDER BY total_per_user ASC) AS asc_total
      FROM (SELECT EXTRACT(MONTH FROM created_at) AS calendar_month,
            user_id,    
            SUM(amount) AS total_per_user
            FROM transactions
            GROUP BY calendar_month, user_id) AS total_amount   
      ORDER BY user_id) AS a
WHERE asc_total IN (desc_total, desc_total+1, desc_total-1)
GROUP BY user_id
;

In Postgres, you can just use the aggregate function percentile_cont():

select 
    user_id,
    percentile_cont(0.5) within group(order by total_per_user) median_total_per_user
from (
    select user_id, sum(amount) total_per_user
    from transactions
    group by date_trunc('month', created_at), user_id
) t
group by user_id

Note that date_trunc() is probably closer to what you want than extract(month from ...), unless you do want to sum amounts of the same month for different years together, which is not how I understood your requirement.
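To make the difference concrete, here is a minimal sketch run against two made-up timestamps (the values and the aliases month_start and month_number are illustrative assumptions, not taken from the question's data). It shows why grouping by extract(month from ...) would merge January 2020 with January 2021, while date_trunc() keeps them apart:

```sql
-- Two hypothetical timestamps: same month number, different years.
SELECT ts,
       date_trunc('month', ts) AS month_start,   -- keeps the year
       extract(month FROM ts)  AS month_number   -- drops the year
FROM (VALUES (timestamp '2020-01-15 10:00'),
             (timestamp '2021-01-20 12:30')) AS v(ts);
-- month_start:  2020-01-01 00:00:00 and 2021-01-01 00:00:00 (two groups)
-- month_number: 1 and 1 (one group)
```

Grouping by month_start therefore yields one total per user per calendar month of each year, which seems to be what the requirement asks for.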

Just use percentile_cont(). I don't fully understand the question. If you want the median of the monthly spending, then:

SELECT user_id,
       PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY total_per_user) AS median_total_per_user
FROM (SELECT DATE_TRUNC('month', created_at) AS calendar_month,
             user_id, SUM(amount) AS total_per_user
      FROM transactions t
      GROUP BY calendar_month, user_id
     ) um
GROUP BY user_id;

Postgres has a built-in aggregate, percentile_cont(), that computes the median. No need for fancier processing.
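For anyone who wants to try this without the original table, the following is a self-contained sketch. The CTE names sample_transactions and monthly_totals and all the rows in them are hypothetical stand-ins; only the column names user_id, amount and created_at come from the question:

```sql
-- Hypothetical sample rows standing in for the transactions table.
WITH sample_transactions (user_id, amount, created_at) AS (
    VALUES (1,  10, timestamp '2020-01-05'),
           (1,  20, timestamp '2020-01-20'),
           (1,  50, timestamp '2020-02-10'),
           (1, 100, timestamp '2021-01-15'),
           (2,  40, timestamp '2020-01-07'),
           (2,  80, timestamp '2020-02-09')
),
-- One total per user per calendar month (year included).
monthly_totals AS (
    SELECT user_id,
           date_trunc('month', created_at) AS calendar_month,
           sum(amount) AS total_per_user
    FROM sample_transactions
    GROUP BY user_id, date_trunc('month', created_at)
)
-- Median of each user's monthly totals.
SELECT user_id,
       percentile_cont(0.5) WITHIN GROUP (ORDER BY total_per_user) AS median_total_per_user
FROM monthly_totals
GROUP BY user_id
ORDER BY user_id;
-- user 1: monthly totals 30, 50, 100 -> median 50
-- user 2: monthly totals 40, 80      -> median 60 (interpolated)
```

With an even number of monthly totals, percentile_cont() interpolates between the two middle values, which is why user 2 gets 60. If you need a value that actually occurs in the data, percentile_disc(0.5) is the discrete alternative.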
