MySQL query using IN with GROUP_CONCAT result

I'm trying to clean up a database with duplicate records. I need to move the references to a single record and delete the other one.

I have two tables, promoters and venues, and each has a reference to a table called cities. The problem is that there are cities with the same name but different ids, and these duplicates are referenced by venues and promoters.

With this query I can group all promoters and venues under a single city record:

SELECT c.id as id, c.name as name, GROUP_CONCAT( DISTINCT p.id ) as promoters_ids, GROUP_CONCAT( DISTINCT v.id ) as venues_ids
FROM cities as c
LEFT JOIN promoters as p ON p.city_id = c.id
LEFT JOIN venues as v ON v.city_id = c.id
WHERE c.name IN ( SELECT name from cities group by name having count(cities.name) > 1 )
GROUP BY c.name

Now I want to run an UPDATE query on promoters, setting city_id equal to the result of the query above.

Something like this:

    UPDATE promoters AS pr SET pr.city_id = (
        SELECT ID
        FROM (
            SELECT c.id as id, c.name as name, GROUP_CONCAT( DISTINCT p.id ) as promoters_ids
            FROM cities as c
            LEFT JOIN promoters as p ON p.city_id = c.id

            WHERE c.name IN ( SELECT name from cities group by name having count(cities.name) > 1 ) AND pr.id IN promoters_ids
            GROUP BY c.name
            ) AS T1 

    )

How can I do this?

Thanks

If I understand correctly, you ultimately want to remove the duplicate cities, so first you need to update the promoters that are linked to any of the cities you are going to remove.

I think it makes sense to use the lowest ID among the cities with the same name. It could just as well be the highest, but I want to specify it explicitly rather than leave it to chance.

So in order to get the right city ID for a promoter, I need to select the lowest ID of all cities that have the same name as the city already linked to that promoter.

Fortunately, that requirement fits snugly into a query:

UPDATE promoters AS pr
SET pr.city_id = (
  SELECT
    -- Select the lowest ID ..
    MIN(c.id)
  FROM
    -- .. of all cities ..
    cities c
    -- .. that have the same name ..
    INNER JOIN cities pc ON pc.name = c.name
  WHERE
    -- .. as the city already linked to the promoter being updated
    pc.id = pr.city_id
  GROUP BY
    c.name)
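
The question mentions that venues reference cities through city_id in the same way, so the same pattern should carry over. The statement below is only a sketch under that assumption; the aliases v and vc are made up for illustration, and it is worth trying on a copy of the data first.

```sql
-- Point every venue at the lowest id among the cities that share
-- the name of the city it currently references.
UPDATE venues AS v
SET v.city_id = (
  SELECT
    MIN(c.id)
  FROM
    cities c
    -- vc is the city the venue references right now
    INNER JOIN cities vc ON vc.name = c.name
  WHERE
    vc.id = v.city_id
);
```

A venue whose city_id is NULL, or points at a city that no longer exists, gets NULL from the subquery, so it ends up with a NULL city_id.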

The trick is to join cities to itself by name, so you can easily get all cities with the same name. I think you tried to do the same with the IN clause, but that's a little more complex than it needs to be.
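
To see what that self-join produces without changing any data, a read-only query along these lines lists each city next to the lowest id among the cities sharing its name. It is a sketch; keep_id is a hypothetical alias, not something from the original schema.

```sql
-- For every city row (pc), find the lowest id among all cities (c)
-- with the same name.
SELECT
  pc.id     AS city_id,
  pc.name   AS name,
  MIN(c.id) AS keep_id
FROM
  cities pc
  INNER JOIN cities c ON c.name = pc.name
GROUP BY
  pc.id, pc.name
ORDER BY
  pc.name, pc.id;
```

Rows where city_id differs from keep_id are the duplicates whose references need to move.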

I don't think you need GROUP_CONCAT at all, except perhaps to check that the inner query really returns the correct cities. Even then it doesn't add much, since you're already grouping on the name. When written like this, you can tell that there should be no way for this to go wrong:

  SELECT
    -- Select the lowest ID ..
    MIN(c.id) AS id,
    GROUP_CONCAT(c.name) AS names -- < already grouped by this, so why...
  FROM
    -- .. of all cities ..
    cities c
    -- .. that have the same name.
    INNER JOIN cities pc ON pc.name = c.name
  GROUP BY
    c.name
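
The end goal in the question is to delete the duplicate cities. Once every promoter and venue row references the lowest id for its city name, a multi-table DELETE with a self-join is one way to do that in MySQL. This is a sketch, with dup and keep as hypothetical aliases; it assumes no other table still references the rows being removed.

```sql
-- Delete every city for which another city with the same name
-- and a lower id exists, leaving only the lowest id per name.
DELETE dup
FROM cities AS dup
INNER JOIN cities AS keep
  ON keep.name = dup.name
 AND keep.id < dup.id;
```

The join form is used here because MySQL rejects a DELETE whose WHERE clause contains a subquery that selects directly from the table being deleted from.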

I hope I understood the question correctly.
