
How to calculate average omitting duplicates per partition in SQL?

I want to calculate the average item count per id, counting each session (a sub-partition within the id) only once.

Sample Data:

id     session      item_count    random_field_1 
1      weoifn2      3             A
1      weoifn2      3             B
1      iuboiwe      2             K
2      oeino33      5             R
2      vergeeg      8             C
2      feooinn      9             P
2      feooinn      9             M

Logic:

  • id = 1: (3 + 2) / 2 = 2.5
  • id = 2: (5 + 8 + 9) / 3 = 7.33

Expected Output:

id      avg
1       2.5
2       7.33

My Query:

SELECT 
   id
 , AVG(item_count) OVER (PARTITION BY id) AS avg
FROM my_table

However, I believe this will count the duplicated rows twice, which is unintended. How can I fix my query to consider only one item_count value per session?
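To make the double counting concrete, here is a minimal sketch that runs the window query against the sample data. The data is inlined as a CTE built with `SELECT ... UNION ALL`, which is an assumption made for portability (it should work in dialects that allow a SELECT without FROM); the alias `avg_item_count` is also just illustrative.

```sql
-- Sample data from the question, inlined so the query runs on its own.
WITH my_table AS (
  SELECT 1 AS id, 'weoifn2' AS session, 3 AS item_count, 'A' AS random_field_1
  UNION ALL SELECT 1, 'weoifn2', 3, 'B'
  UNION ALL SELECT 1, 'iuboiwe', 2, 'K'
  UNION ALL SELECT 2, 'oeino33', 5, 'R'
  UNION ALL SELECT 2, 'vergeeg', 8, 'C'
  UNION ALL SELECT 2, 'feooinn', 9, 'P'
  UNION ALL SELECT 2, 'feooinn', 9, 'M'
)
-- The window AVG sees every row of the partition, so a session that
-- appears on two rows contributes its item_count twice. The result also
-- keeps one row per input row (7 rows here) instead of one row per id.
SELECT
  id,
  AVG(item_count) OVER (PARTITION BY id) AS avg_item_count
FROM my_table;
```

For id = 1 this averages (3 + 3 + 2) / 3 ≈ 2.67 rather than 2.5, and for id = 2 it averages (5 + 8 + 9 + 9) / 4 = 7.75 rather than 7.33. Some engines (SQL Server, for example) average integers with integer arithmetic, so a cast to a decimal type may be needed there.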

Consider the approach below:

select id, avg(item_count) as avg
from (
  select distinct id, session, item_count
  from your_table
)
group by id           

If applied to the sample data in your question, the output is:

[image: query output]
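The DISTINCT approach can be tried without creating a table. The sketch below inlines the question's sample data as a CTE named `your_table`; the `SELECT ... UNION ALL` construction, the intermediate CTE name `one_row_per_session`, and the alias `avg_item_count` are assumptions for illustration only.

```sql
-- Sample data from the question, inlined so the query runs on its own.
WITH your_table AS (
  SELECT 1 AS id, 'weoifn2' AS session, 3 AS item_count, 'A' AS random_field_1
  UNION ALL SELECT 1, 'weoifn2', 3, 'B'
  UNION ALL SELECT 1, 'iuboiwe', 2, 'K'
  UNION ALL SELECT 2, 'oeino33', 5, 'R'
  UNION ALL SELECT 2, 'vergeeg', 8, 'C'
  UNION ALL SELECT 2, 'feooinn', 9, 'P'
  UNION ALL SELECT 2, 'feooinn', 9, 'M'
),
-- Dropping random_field_1 and applying DISTINCT collapses the repeated
-- (id, session, item_count) rows into a single row per session.
one_row_per_session AS (
  SELECT DISTINCT id, session, item_count
  FROM your_table
)
-- A plain aggregate with GROUP BY returns exactly one row per id.
SELECT
  id,
  AVG(item_count) AS avg_item_count
FROM one_row_per_session
GROUP BY id
ORDER BY id;
```

On this data the averages are (3 + 2) / 2 = 2.5 for id = 1 and (5 + 8 + 9) / 3 ≈ 7.33 for id = 2, matching the expected output. One caveat: DISTINCT only removes rows that agree on all three columns, so if a session ever carried two different item_count values, both would enter the average.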

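Another option is to keep item_count only on the first row of each session and let AVG skip the NULLs:

-- DISTINCT collapses the windowed result to one row per id.
SELECT DISTINCT id, AVG(item_count) OVER (PARTITION BY id) AS avg
FROM (
  SELECT
    id
    -- Keep item_count on the first row of each (id, session); repeats become NULL.
    , CASE
        WHEN ROW_NUMBER() OVER (PARTITION BY id, session ORDER BY item_count) = 1
          THEN item_count
        ELSE NULL
      END
        AS item_count
  FROM my_table
) t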
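For the ROW_NUMBER idea to give the expected result, the function needs its parentheses and the numbering has to restart per (id, session) pair rather than per id. Here is a self-contained sketch under those assumptions, with the question's sample data inlined; the CTE name `first_row_flagged`, the alias `avg_item_count`, and the `ORDER BY random_field_1` tie-breaker are illustrative choices, not part of the original answer.

```sql
-- Sample data from the question, inlined so the query runs on its own.
WITH my_table AS (
  SELECT 1 AS id, 'weoifn2' AS session, 3 AS item_count, 'A' AS random_field_1
  UNION ALL SELECT 1, 'weoifn2', 3, 'B'
  UNION ALL SELECT 1, 'iuboiwe', 2, 'K'
  UNION ALL SELECT 2, 'oeino33', 5, 'R'
  UNION ALL SELECT 2, 'vergeeg', 8, 'C'
  UNION ALL SELECT 2, 'feooinn', 9, 'P'
  UNION ALL SELECT 2, 'feooinn', 9, 'M'
),
-- Number the rows inside each (id, session) pair and keep item_count
-- only on row 1; every repeated row gets NULL instead.
first_row_flagged AS (
  SELECT
    id,
    CASE
      WHEN ROW_NUMBER() OVER (
             PARTITION BY id, session
             ORDER BY random_field_1
           ) = 1
        THEN item_count
    END AS item_count
  FROM my_table
)
-- AVG ignores NULLs, so each session contributes exactly one value.
SELECT
  id,
  AVG(item_count) AS avg_item_count
FROM first_row_flagged
GROUP BY id
ORDER BY id;
```

Unlike a DISTINCT over (id, session, item_count), this keeps exactly one value per session even when the repeated rows disagree on item_count; which value survives then depends on the ORDER BY inside the window.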

Technical posts on this site are licensed under CC BY-SA 4.0; if you republish them, please cite this site's URL or the original address.

 