
Rollup - Filter rows by subtotal amounts (SQL, Oracle)

Suppose I have a table and query (shown below), where the table holds the population of a given country in a given continent for a given year.

I want to return each country's avg(population) together with its continent's average population, but only if the country's average is greater than the continent's average + 3. Basically, I want to keep only the rows that exceed the continent subtotal value by a certain margin.

I modified this, and I realize the data does not have multiple years per country and that the numbers are obviously garbage, but this is just an example.

 create table abc (continent varchar2(30), country varchar2(30), population number, yr number);
 insert into abc values ('africa', 'kenya', 50, 2005);
 insert into abc values ('africa', 'egypt', 100, 2006);
 insert into abc values ('africa', 'south africa', 35, 2007);
 insert into abc values ('africa', 'nigeria', 200, 2008);
 insert into abc values ('asia', 'china', 50, 2005);
 insert into abc values ('asia', 'india', 100, 2006);
 insert into abc values ('asia', 'japan', 35, 2007);
 insert into abc values ('asia', 'korea', 200, 2008);


 select continent, country, avg(population)
   from abc
 -- where: population for each country > 3 + avg for each continent
 -- should return the egypt/nigeria rows and the india/korea rows,
 -- since the average here is 96.25 for each continent
  group by rollup(continent, country);

So, if the continent average is defined simply as the average of all rows for that continent, a solution can be:

select continent
     , country
     , avg(population) country_avg
     , max(continent_avg) continent_avg
  from (
   select continent
        , country
        , population
        , avg(population) over (
             partition by continent
          ) continent_avg
     from abc
  )
 group by continent, country
having avg(population) > max(continent_avg) + 3
 order by continent, country;
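
Since the question uses rollup, here is an untested sketch of a variant that keeps the continent subtotal as its own row instead of an extra column. It assumes a partial rollup (group by continent, rollup(country)), so only continent subtotals are produced and no grand total. The names avg_pop and is_subtotal are mine, not from the question. The rollup subtotal is the average of all rows in the continent, so it uses the same definition as the query above.

```sql
select continent
     , country
     , avg_pop
  from (
   select continent
        , country
        , avg(population) avg_pop
          -- 1 on the continent subtotal row, 0 on ordinary country rows
        , grouping(country) is_subtotal
          -- copy the subtotal value onto every row of the same continent
        , max(case when grouping(country) = 1 then avg(population) end) over (
             partition by continent
          ) continent_avg
     from abc
    group by continent, rollup(country)
  )
 where is_subtotal = 1                 -- always keep the subtotal row
    or avg_pop > continent_avg + 3     -- keep countries above the threshold
 order by continent, is_subtotal, country;
```

With the sample data this should keep egypt and nigeria plus the africa subtotal row, and india and korea plus the asia subtotal row.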

The reason I asked about the definition of the continent average is that if some countries within a continent have more rows in the table (= more years), those countries will weigh more in an average calculated like that. An alternative is to define the continent average as the average of the country averages, in which case a solution can be:

select *
  from (
   select continent
        , country
        , avg(population) country_avg
        , avg(avg(population)) over (
             partition by continent
          ) continent_avg
     from abc
    group by continent, country
  )
 where country_avg > continent_avg + 3;
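
To see how the two definitions can drift apart, here is a small sketch that computes both continent averages side by side. The two extra nigeria rows are hypothetical and only there for illustration. The column names avg_of_rows and avg_of_country_avgs are mine.

```sql
-- Hypothetical extra rows (not in the original data): two more years for nigeria
insert into abc values ('africa', 'nigeria', 200, 2009);
insert into abc values ('africa', 'nigeria', 200, 2010);

select continent
       -- definition 1: every row weighs the same
     , sum(pop_sum) / sum(row_cnt) avg_of_rows
       -- definition 2: every country weighs the same
     , avg(country_avg) avg_of_country_avgs
  from (
   select continent
        , country
        , sum(population)   pop_sum
        , count(population) row_cnt
        , avg(population)   country_avg
     from abc
    group by continent, country
  )
 group by continent
 order by continent;
```

With those extra rows, africa has 6 rows summing to 785, so avg_of_rows is about 130.83, while avg_of_country_avgs stays at (50 + 100 + 35 + 200) / 4 = 96.25. With the + 3 threshold, egypt (100) passes under the second definition but not under the first. Asia is unchanged at 96.25 under both.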

If all countries have the same number of years (the same number of rows), the two solutions ought to give the same result. But if countries can have different numbers of years, you will have to pick the solution that fits your requirements.
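
If you are unsure which case applies to your data, a quick check along these lines may help. It is a minimal sketch that lists the continents where the countries do not all have the same number of rows. The names row_cnt, min_rows and max_rows are mine.

```sql
select continent
     , min(row_cnt) min_rows
     , max(row_cnt) max_rows
  from (
   -- count(population) ignores null populations, just like avg(population) does
   select continent
        , country
        , count(population) row_cnt
     from abc
    group by continent, country
  )
 group by continent
having min(row_cnt) <> max(row_cnt);
```

If this returns no rows, as it does for the original sample data with one row per country, both solutions give the same result.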
