
Group by for total price against item in multiple categories

Consider the following data:

Items | Price | Categories
--------------------------
Item1 |    10 | Cat1, Cat2
Item2 |    20 | Cat1, Cat3
Item3 |    15 | Cat1, Cat2
--------------------------
Total |    45

If I group the data by category, the following stats come up:

Categories | Price
------------------
Cat1       |    45
Cat2       |    25
Cat3       |    20
------------------
Total      |    90

Now if I have to sum up the prices, the actual sum is 45, but the sum across the categories shown is different, i.e. 90. So the two representations give different stats, but each is correct in its own way.

I am looking for an answer to this question: how should I represent such a stat? For example, the total at the top says 45, but the sum across the categories says 90. Isn't that confusing for the end user?

Any suggestions on how to tackle this problem, other than choosing one category per item?

Below is for BigQuery Standard SQL:

#standardSQL
SELECT category, SUM(price) AS price FROM (
  SELECT category, price FROM `project.dataset.table`, 
  UNNEST(SPLIT(categories, ', ')) category UNION ALL
  SELECT 'Total', price FROM `project.dataset.table`
)
GROUP BY category

Applied to the sample data from your question, as in the example below:

#standardSQL
WITH `project.dataset.table` AS (
  SELECT 'Item1' AS item, 10 AS price, 'Cat1, Cat2'  AS categories UNION ALL
  SELECT 'Item2', 20, 'Cat1, Cat3' UNION ALL
  SELECT 'Item3', 15, 'Cat1, Cat2' 
)
SELECT category, SUM(price) AS price FROM (
  SELECT category, price FROM `project.dataset.table`, 
  UNNEST(SPLIT(categories, ', ')) category UNION ALL
  SELECT 'Total', price FROM `project.dataset.table`
)
GROUP BY category   

the result is:

Row category    price    
1   Cat1        45   
2   Cat2        25   
3   Cat3        20   
4   Total       45     

with the correct Total value.
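To see why the Total row (45) differs from the sum of the category rows (90), here is a minimal sketch in BigQuery Standard SQL. It assumes the same sample data; the CTE name `items` and the output column names `item_total` and `category_total` are made up for illustration. Each item is counted once in the first sum and once per category in the second.

```sql
#standardSQL
WITH items AS (
  SELECT 'Item1' AS item, 10 AS price, 'Cat1, Cat2' AS categories UNION ALL
  SELECT 'Item2', 20, 'Cat1, Cat3' UNION ALL
  SELECT 'Item3', 15, 'Cat1, Cat2'
)
SELECT
  -- every item contributes its price exactly once
  SUM(price) AS item_total,
  -- every item contributes its price once per category it belongs to
  SUM(price * ARRAY_LENGTH(SPLIT(categories, ', '))) AS category_total
FROM items
```

With the sample data, `item_total` is 45 and `category_total` is 10*2 + 20*2 + 15*2 = 90, so the per-category figures overlap rather than partition the total.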

You can split the data and then re-aggregate:

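with t as (
      select 'Item1' as item, 10 as price, 'Cat1, Cat2'  as categories union all
      select 'Item2', 20, 'Cat1, Cat3' union all
      select 'Item3', 15, 'Cat1, Cat2' union all
      -- this row has NULL categories, so the unnest below produces no rows for it
      select 'Total', 45, NULL
     )
-- one output row per category, plus a rollup row labelled 'Total'
select coalesce(category, 'Total'), sum(price)
from t cross join
     -- split the comma-separated list into one row per (item, category)
     unnest(split(t.categories, ', ')) category
-- the rollup row sums the per-category rows, i.e. 90 for this data
group by rollup(category)
order by category nulls last;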

If you don't need the total, then remove the rollup.
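The variant without the total can be sketched as follows. This is a minimal sketch assuming the same sample data, with the placeholder 'Total' row left out of the CTE and the output column alias `price` added for readability:

```sql
#standardSQL
WITH t AS (
  SELECT 'Item1' AS item, 10 AS price, 'Cat1, Cat2' AS categories UNION ALL
  SELECT 'Item2', 20, 'Cat1, Cat3' UNION ALL
  SELECT 'Item3', 15, 'Cat1, Cat2'
)
SELECT category, SUM(price) AS price
FROM t CROSS JOIN
     -- one row per (item, category) pair
     UNNEST(SPLIT(t.categories, ', ')) AS category
-- plain GROUP BY: per-category sums only, no rollup row
GROUP BY category
ORDER BY category;
```

For the sample data this returns only the three category rows: Cat1 45, Cat2 25, Cat3 20.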
