How to write the following SQL query involving subqueries

I have the following table named population:

╔════════════╦════════════╦════════════════╗
║     india  ║ hyderabad  ║          50100 ║
║     india  ║ delhi      ║          75000 ║
║     USA    ║ NewYork    ║          25000 ║
║     USA    ║ california ║          30000 ║
║     india  ║  delhi     ║           5000 ║
║     USA    ║  NewYork   ║          75000 ║
╚════════════╩════════════╩════════════════╝

I need to write a SQL query to get data in the following format:

╔════════╦═════════╦══════════╗
║ india  ║ delhi   ║    80000 ║
║ USA    ║ NewYork ║   100000 ║
╚════════╩═════════╩══════════╝

That is, for each country, the country name and the city with the highest population, where multiple entries for the same city are summed up.

You can use:

SELECT *
FROM (
  SELECT country,city, SUM(pop) AS total
  FROM population 
  GROUP BY country,city) AS sub
WHERE (country, total) IN (
                           SELECT country, MAX(total)
                           FROM (SELECT country,city, SUM(pop) AS total
                                 FROM population 
                                 GROUP BY country,city
                             ) as s
                           GROUP BY country
                           );

If two cities in the same country share the same highest total population, you will get both cities for that country.
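The tie behaviour can be checked with a small script. This is only a sketch: it runs the query above through Python's built-in sqlite3 module rather than MySQL, and it adds one hypothetical row (hyderabad, 29900) that is not in the question's data, so that hyderabad and delhi both total 80000.

```python
import sqlite3

# Sample rows from the question, plus one HYPOTHETICAL extra row
# (hyderabad, 29900) added only to force a tie with delhi at 80000.
rows = [
    ("india", "hyderabad", 50100),
    ("india", "delhi", 75000),
    ("USA", "NewYork", 25000),
    ("USA", "california", 30000),
    ("india", "delhi", 5000),
    ("USA", "NewYork", 75000),
    ("india", "hyderabad", 29900),  # assumption, not in the original data
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE population (country TEXT, city TEXT, pop INTEGER)")
conn.executemany("INSERT INTO population VALUES (?, ?, ?)", rows)

# Same query as in the answer, with an ORDER BY added for stable output.
query = """
SELECT *
FROM (SELECT country, city, SUM(pop) AS total
      FROM population
      GROUP BY country, city) AS sub
WHERE (country, total) IN (
    SELECT country, MAX(total)
    FROM (SELECT country, city, SUM(pop) AS total
          FROM population
          GROUP BY country, city) AS s
    GROUP BY country)
ORDER BY country, city
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

With the extra row in place, both india cities come back, alongside the single USA row.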

Output:

╔══════════╦═════════╦════════╗
║ country  ║  city   ║ total  ║
╠══════════╬═════════╬════════╣
║ india    ║ delhi   ║  80000 ║
║ USA      ║ NewYork ║ 100000 ║
╚══════════╩═════════╩════════╝

You could use a combination of GROUP_CONCAT and FIND_IN_SET. This query returns a comma-separated list of cities for every country, ordered by population in descending order:

SELECT country, GROUP_CONCAT(city ORDER BY pop DESC) AS cities
FROM population
GROUP BY country

It will return something like this:

| country |                     cities |
|---------|----------------------------|
|   india |      delhi,hyderabad,delhi |
|     USA | NewYork,california,NewYork |

Then we can join this subquery back to the population table using FIND_IN_SET, which returns the position of a city in the list of cities:

SELECT
  p.country,
  p.city,
  SUM(p.pop)
FROM
  population p INNER JOIN (
    SELECT country, GROUP_CONCAT(city ORDER BY pop DESC) AS cities
    FROM population
    GROUP BY country
  ) m ON p.country=m.country
         AND FIND_IN_SET(p.city, m.cities)=1
GROUP BY
  p.country,
  p.city

The join succeeds only for the city listed first for each country, that is, the one with the highest pop value, because of the condition FIND_IN_SET(p.city, m.cities)=1.

This works only if a single city has the maximum population; if several cities tie, only one is returned. Note also that GROUP_CONCAT orders by the individual pop values rather than by the per-city totals, so the result is right only when the city with the largest single entry is also the one with the largest total. This is also not standard SQL and works only on MySQL or similar systems; other DBMSs (and MySQL itself from version 8.0) have window functions that make the same query easier to write.
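As a rough illustration of the window-function approach mentioned above, the sketch below ranks the per-city totals within each country with RANK() and keeps rank 1. It is run through Python's built-in sqlite3 module purely so that it is self-contained (this assumes an SQLite build of 3.25 or newer, which is when window functions were added); the aliases sums, ranked and rnk are illustrative names, not part of the original answers.

```python
import sqlite3

# Sample rows copied from the question's population table.
rows = [
    ("india", "hyderabad", 50100),
    ("india", "delhi", 75000),
    ("USA", "NewYork", 25000),
    ("USA", "california", 30000),
    ("india", "delhi", 5000),
    ("USA", "NewYork", 75000),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE population (country TEXT, city TEXT, pop INTEGER)")
conn.executemany("INSERT INTO population VALUES (?, ?, ?)", rows)

# Innermost query: sum the entries per (country, city).
# Middle query: rank those totals within each country, largest first.
# Outer query: keep only rank 1 (ties all get rank 1, so all are kept).
query = """
SELECT country, city, total
FROM (SELECT country, city, total,
             RANK() OVER (PARTITION BY country ORDER BY total DESC) AS rnk
      FROM (SELECT country, city, SUM(pop) AS total
            FROM population
            GROUP BY country, city) AS sums) AS ranked
WHERE rnk = 1
ORDER BY country
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Because the ranking is done on the summed totals, this variant does not depend on which single row happens to be largest, and cities that tie for first place are all returned.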

The following answer is not correct, because it relies on a MySQL-specific feature that violates the ANSI standard: selecting a non-aggregated column that is not in the GROUP BY clause. The result is not deterministic, since it is not defined which city name is returned when aggregating by country. In most cases the first row encountered is used, which is why sorting in the inner query usually makes this work. But beware: using the first city is not guaranteed by definition, so there can be cases where this outputs wrong results. The answer also does not cover the case where two cities share the maximum population for a country; it outputs only one city per country.

I would solve it with an inner subquery that groups and sums all cities, and an outer query that keeps only the largest one per country.

SELECT
  country,
  city,  -- not aggregated and not in GROUP BY: which city is returned is undefined
  MAX(population_total) AS population_total
FROM
  (
    SELECT country, city, SUM(pop) AS population_total
    FROM population
    GROUP BY country, city
    ORDER BY population_total DESC
  ) AS t1
GROUP BY
  country
