Subquery yields different results when used alone
I have to write a query across two different tables, country and city. The goal is to get every district and that district's population for every country. As the district is just an attribute of each city, I have to sum the populations of all cities belonging to a district.
My query so far looks like this:
SELECT country.name, country.population, array_agg(
           (SELECT (c.district, sum(city.population))
            FROM city GROUP BY c.district))
       AS districts
FROM country
FULL OUTER JOIN city c ON country.code = c.countrycode
GROUP BY country.name, country.population;
The result:
name | population | districts
---------------------------------------------+------------+------------------------------------------------------------------------------------------------------------------
Afghanistan | 22720000 | {"(Balkh,1429559884)","(Qandahar,1429559884)","(Herat,1429559884)","(Kabol,1429559884)"}
Albania | 3401200 | {"(Tirana,1429559884)"}
Algeria | 31471000 | {"(Blida,1429559884)","(Béjaïa,1429559884)","(Annaba,1429559884)","(Batna,1429559884)","(Mostaganem,1429559884)"
American Samoa | 68000 | {"(Tutuila,1429559884)","(Tutuila,1429559884)"}
So apparently it sums the populations of all cities in the world. I need to limit the sum to each district somehow.
But if I run the subquery on its own as
SELECT (city.district, sum(city.population)) FROM city GROUP BY city.district;
it gives me the districts with their populations:
row
----------------------------------
(Bali,435000)
(,4207443)
(Dnjestria,194300)
(Mérida,224887)
(Kochi,324710)
(Qazvin,291117)
(Izmir,2130359)
(Meta,273140)
(Saint-Denis,131480)
(Manitoba,618477)
(Changhwa,354117)
I realized it has something to do with the alias (the abbreviation c) that I use in the join. I used it for convenience, but it seems to have real consequences: if I don't use it, I get the error
more than one row returned by a subquery used as an expression
Also, if I use
sum(c.population)
in the subquery, it won't execute because
aggregate function calls cannot be nested
Using this alias in the join apparently changes a lot.
I hope someone can shed some light on that.
Solved it myself.
Window functions are the most convenient method for this kind of task:
SELECT DISTINCT
    country.name
  , country.population
  , city.district
  , sum(city.population) OVER (PARTITION BY city.district)
        AS district_population
  , sum(city.population) OVER (PARTITION BY city.district) / CAST(country.population AS float)
        AS district_share
FROM
    country JOIN city ON country.code = city.countrycode
;
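One caveat, assuming district names are not unique across countries: PARTITION BY city.district alone would then add up cities from different countries that happen to share a district name (the empty district name in the question's output may be such a case). Below is a minimal sketch of a variant that aggregates per country and district with a plain GROUP BY, which also makes DISTINCT unnecessary. The NULLIF guard against a country population of zero is my own addition, not part of the original query.

```sql
-- Sketch: one row per (country, district), aggregated with GROUP BY.
-- Assumes country.code identifies a country, as the join condition suggests.
SELECT
    country.name
  , country.population
  , city.district
  , sum(city.population) AS district_population
    -- NULLIF turns a zero population into NULL instead of raising a division error
  , sum(city.population) / CAST(NULLIF(country.population, 0) AS float)
        AS district_share
FROM
    country JOIN city ON country.code = city.countrycode
GROUP BY
    country.code
  , country.name
  , country.population
  , city.district
ORDER BY
    country.name
  , city.district
;
```

Grouping by country.code as well as city.district keeps equally named districts in different countries apart.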
But it also works with subselects:
SELECT DISTINCT
    country.name
  , country.population
  , city.district
  , (
        SELECT
            sum(ci.population)
        FROM
            city ci
        WHERE ci.district = city.district
    ) AS district_population
  , (
        SELECT
            sum(ci2.population) / CAST(country.population AS float)
        FROM
            city ci2
        WHERE ci2.district = city.district
    ) AS district_share
FROM
    country JOIN city ON country.code = city.countrycode
ORDER BY
    country.name
  , country.population
;
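As for why the query in the question behaved the way it did, this is my understanding, as far as I can tell. Inside the subquery, c.district refers to the row of the outer join, so it is a constant there. FROM city is a fresh, uncorrelated scan of the whole table, and GROUP BY c.district groups by that constant, which leaves a single group containing every city in the world. Without the alias, city.district binds to the inner city table, so the subquery returns one row per district, hence the "more than one row" error. And sum(c.population) has only an outer reference as its argument, so it counts as an aggregate of the outer query and ends up nested inside array_agg. The array output of the original query can presumably be kept by aggregating per district in a derived table first. The alias d and the column name district_population below are my own choices.

```sql
-- Sketch: aggregate per country and district first, then collect into an array.
SELECT
    country.name
  , country.population
  , array_agg((d.district, d.district_population)) AS districts
FROM
    country
    LEFT JOIN (
        -- one row per district within a country
        SELECT
            countrycode
          , district
          , sum(population) AS district_population
        FROM
            city
        GROUP BY
            countrycode
          , district
    ) AS d ON d.countrycode = country.code
GROUP BY
    country.name
  , country.population
;
```

The LEFT JOIN keeps countries without any city; for those the array holds a single entry whose fields are both NULL.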