MySQL getting duplicates from select distinct + union
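When I run the following query in MySQL, I get a lot of duplicates. I have asked explicitly for distinct records only, so I can't understand why rows are being doubled up. The duplicates all seem to appear when I include the last union (the importorders table), because most customers have the same address in both customers and orders. Can anyone help me understand why this happens?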
SELECT DISTINCT PostalCode, City, Region, Country
FROM
(select distinct postalcode, city, region, country
from importemployees
UNION
select distinct postalcode, city, region, country
from importcustomers
UNION
select distinct postalcode, city, region, country
from importproducts
UNION
select distinct shippostalcode as postalcode, shipcity as city, shipregion as region, shipcountry as country
from importorders) T
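As you can see, some rows are duplicated.
If I use INSERT IGNORE to insert importcustomers first and then importorders, it does manage to identify the records as duplicates. Why doesn't the select query work?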
A very strange problem. It seems to go away when I drop 'Country'.
SELECT DISTINCT PostalCode, City, Region
128 total, Query took 0.0066 sec
SELECT DISTINCT PostalCode, City, Region, Country
209 total, Query took 0.0002 sec
Also, the behavior seems to affect only ImportCustomers and ImportOrders:
SELECT postalcode, city, region, country
FROM
(SELECT postalcode, city, region, country FROM importcustomers
UNION
SELECT shippostalcode, shipcity, shipregion, shipcountry FROM importorders) t
172 total, Query took 0.0053 sec
SELECT postalcode
FROM
(SELECT postalcode FROM importcustomers
UNION
SELECT shippostalcode FROM importorders) t
91 total, Query took 0.0050 sec
Then I narrowed it down to the country column of importcustomers and importorders:
SELECT TRIM(country) AS country FROM importcustomers
UNION
SELECT TRIM(shipcountry) AS country FROM importorders
Argentina Argentina Austria Austria Belgium Belgium ...
Something interesting happens when I cast the column to BINARY:
SELECT BINARY country AS country FROM importcustomers
UNION
SELECT BINARY shipcountry AS country FROM importorders
Argentina 417267656e74696e610d Austria 417573747269610d Belgium 42656c6769756d0d ...
The ImportOrders table is what causes the duplicates:
SELECT BINARY shipcountry AS country FROM importorders
4765726d616e790d 5553410d 5553410d 4765726d616e790d ...
Looking at the dump you provided, there is an extra \r appended to the end of each country value (it shows up as 0d in the hex values above).
--
-- Dumping data for table `importorders`
--
INSERT INTO `importorders` VALUES
...'Germany\r'),
...'USA\r'),
...'USA\r'),
...'Germany\r'),
...'Mexico\r'),
In importcustomers the country values look fine:
--
-- Dumping data for table `importcustomers`
--
INSERT INTO `importcustomers` VALUES
...'Germany', ... ,
...'Mexico', ... ,
...'Mexico', ... ,
...'UK', ... ,
...'Sweden', ... ,
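The same check can be done without a dump. The following is only a sketch, assuming the default sql_mode (so that '\r' inside a string literal is read as a carriage return); the aliases hex_value and char_len are made up for illustration. It lists the distinct shipcountry values that end in a carriage return, together with their hex form:

```sql
-- List distinct country values in importorders that end with a carriage return.
-- HEX() makes the invisible trailing byte (0D) visible.
SELECT DISTINCT
       shipcountry,
       HEX(shipcountry)         AS hex_value,  -- illustrative alias
       CHAR_LENGTH(shipcountry) AS char_len    -- illustrative alias
FROM importorders
WHERE shipcountry LIKE '%\r';
```

If this returns rows for importorders but nothing for the equivalent query on importcustomers.country, that would match what the dump shows.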
You can remove these \r characters (carriage returns) by running this query:
UPDATE importorders SET ShipCountry = REPLACE(ShipCountry, '\r', '')
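If you would rather touch only the affected rows and strip only a trailing carriage return (REPLACE removes every \r anywhere in the value), a narrower variant might look like the sketch below. It again assumes the default sql_mode for the '\r' escape, and the alias remaining_rows is hypothetical:

```sql
-- Strip only trailing carriage returns, and only from rows that have one.
UPDATE importorders
SET ShipCountry = TRIM(TRAILING '\r' FROM ShipCountry)
WHERE ShipCountry LIKE '%\r';

-- Sanity check afterwards: count the rows that still end in a carriage return.
SELECT COUNT(*) AS remaining_rows   -- illustrative alias
FROM importorders
WHERE ShipCountry LIKE '%\r';
```

Note that plain TRIM(ShipCountry) strips only spaces, which is why the TRIM() attempt in the question still returned each country twice.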
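If you then run your original query, you should get the result set you want. FYI, if you are using UNION, you don't need DISTINCT, because UNION (as opposed to UNION ALL) already removes duplicate rows: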
SELECT PostalCode, City, Region, Country
FROM
(SELECT postalcode, city, region, country FROM importemployees
UNION
SELECT postalcode, city, region, country FROM importcustomers
UNION
SELECT postalcode, city, region, country FROM importproducts
UNION
SELECT shippostalcode as postalcode, shipcity as city,
shipregion as region, shipcountry as country FROM importorders) T
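If modifying the imported data is not an option, the carriage return could instead be stripped at query time. This is a sketch only, shown for the two tables involved in the duplication and assuming the default sql_mode for the '\r' escape:

```sql
-- Non-destructive alternative: normalize shipcountry inside the query
-- so that UNION sees 'Germany' rather than 'Germany\r'.
SELECT postalcode, city, region, country
FROM importcustomers
UNION
SELECT shippostalcode,
       shipcity,
       shipregion,
       TRIM(TRAILING '\r' FROM shipcountry)
FROM importorders;
```

Cleaning the data once with the UPDATE is usually simpler, since every later query would otherwise need the same TRIM expression.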