How to calculate a percentile for every level in a game using PostgreSQL 9.2

Question

I have a table of game logs, like this:


Level Shuffle_Count
  1        3
  2        1
  2        2
  2        1
  3        0
  3        4

Whenever a user plays a level, a row is added to the table. Each row records which level was played (level) and how many times a shuffle happened during that play (shuffle_count).
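To experiment with the queries in this thread, the sample rows above can be loaded into a scratch table. This is only a minimal sketch: the integer column types and the unqualified table name game_log are assumptions, since the question shows the data but not the real definition of ch.public.game_log.

```sql
-- Hypothetical schema: column types are assumed, not taken from the question.
CREATE TABLE game_log (
    level         integer NOT NULL,  -- which level was played
    shuffle_count integer NOT NULL   -- shuffles during that play
);

-- One row per play, matching the sample table above.
INSERT INTO game_log (level, shuffle_count) VALUES
    (1, 3),
    (2, 1),
    (2, 2),
    (2, 1),
    (3, 0),
    (3, 4);
```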

I want to know how many times shuffle occurs in every level by calculating the median of shuffle_count for every level. With the code below, I can find the median for level 2 on its own. First, I build a CTE that orders the shuffle_count values and divides them into 4 equal groups with ntile. Then I select the minimum shuffle_count among the rows whose new quartile column equals 3.

-- Approximate median for a single level (here: level 2)
WITH ranked_test AS (
    SELECT shuffle_count,
           ntile(4) OVER (ORDER BY shuffle_count) AS quartile
    FROM ch.public.game_log
    WHERE level = 2
)
SELECT min(shuffle_count)
FROM ranked_test
WHERE quartile = 3
GROUP BY quartile;

This is the result of the CTE, before selecting the minimum shuffle_count where quartile = 3 (which is approximately the median):

Shuffle_Count quartile
     0           1
     0           1
     2           2
     3           2
     4           3
     8           3
     12          4
     19          4
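As a quick check of how ntile(4) assigns these buckets, the eight values above can be fed in through an inline VALUES list, so no real table is needed. The alias t is just a placeholder name.

```sql
-- Eight ordered values split into 4 buckets of 2 rows each.
SELECT shuffle_count,
       ntile(4) OVER (ORDER BY shuffle_count) AS quartile
FROM (VALUES (0), (0), (2), (3), (4), (8), (12), (19)) AS t(shuffle_count)
ORDER BY shuffle_count;
```

For this data, the smallest value in quartile 3 is 4, while the exact median (the average of the two middle values, 3 and 4) is 3.5, which is why the result is only approximate.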

So far so good. But the problem is that I have over 1000 levels and I can't do that manually for each level. I need the median value of shuffle_count for every level from 1 to 1000. I know this could be done in a single line in PostgreSQL 9.4, but unfortunately I don't have that option right now.
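For reference, the 9.4 approach mentioned here is presumably the ordered-set aggregate percentile_cont, which was introduced in PostgreSQL 9.4 and is not available in 9.2. A sketch of what it would look like against the same table:

```sql
-- Requires PostgreSQL 9.4 or later; does not run on 9.2.
SELECT level,
       percentile_cont(0.5) WITHIN GROUP (ORDER BY shuffle_count) AS median_shuffle_count
FROM ch.public.game_log
GROUP BY level
ORDER BY level;
```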

I couldn't make this work with a simple GROUP BY. I guess I need a more complex query, with a FOR loop or something similar.

Do you have any ideas? Thanks in advance.

Answer 1

I think that this should do it for your use case:

with ranked_test as (
    select
        level,
        shuffle_count,
        -- partition by level so that each level gets its own quartiles
        ntile(4) over(partition by level order by shuffle_count) as quartile
    from ch.public.game_log
)
select level, quartile, min(shuffle_count)
from ranked_test
where quartile = 3
group by level, quartile;

This is basically an extended version of your working query:

- in the CTE, we remove the filter on level and add level to the partition by of the window function instead
- in the outer query, we add level to the select and group by clauses
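Note that the minimum of the third quartile only approximates the median, and levels with fewer than three rows never reach quartile 3, so they drop out of the result. If an exact median is needed on 9.2, one possible sketch (a different technique from the query above, using row_number() and count(*) as window functions) is to average the one or two middle rows of each level. The names ranked, rn, cnt and median_shuffle_count are illustrative.

```sql
WITH ranked AS (
    SELECT level,
           shuffle_count,
           -- position of the row within its level, ordered by shuffle_count
           row_number() OVER (PARTITION BY level ORDER BY shuffle_count) AS rn,
           -- total number of rows for that level
           count(*) OVER (PARTITION BY level) AS cnt
    FROM ch.public.game_log
)
SELECT level,
       avg(shuffle_count) AS median_shuffle_count
FROM ranked
-- Integer division: for an odd cnt both expressions pick the single middle
-- row; for an even cnt they pick the two middle rows, which avg() combines.
WHERE rn IN ((cnt + 1) / 2, (cnt + 2) / 2)
GROUP BY level
ORDER BY level;
```

With 8 rows in a level, for example, this averages rows 4 and 5; with 7 rows it returns row 4.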

Answer 1: 2, accepted, 2020-01-16 16:35:38
