MySQL: why do I need an extra SELECT with "WHERE id NOT IN (subquery1 UNION subquery2)"?

I have a table similar to this simplified version:

CREATE TABLE `accounts` (
  `id` int(11) NOT NULL,
  `account_type_id` int(10) NOT NULL,
  `type` varchar(10) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

INSERT INTO `accounts` VALUES (1,1,'single'),(2,1,'single'),(3,1,'single'),(4,1,'single'),(5,1,'single'),(6,1,'single'),(7,1,'single'),(8,1,'single'),(9,1,'single'),(10,2,'single'),(11,2,'single'),(12,2,'single'),(13,2,'single'),(14,2,'single'),(15,2,'single'),(16,2,'single'),(17,2,'single'),(18,2,'single'),(19,2,'single'),(20,2,'single'),(21,1,'joint'),(22,1,'joint'),(23,1,'joint'),(24,1,'joint'),(25,1,'joint'),(26,1,'joint'),(27,1,'joint'),(28,1,'joint'),(29,1,'joint'),(30,1,'joint'),(31,2,'joint'),(32,2,'joint'),(33,2,'joint'),(34,2,'joint'),(35,2,'joint'),(36,2,'joint'),(37,2,'joint'),(38,2,'joint'),(39,2,'joint'),(40,2,'joint'),(41,3,'single'),(42,3,'single'),(43,3,'single'),(44,3,'single'),(45,3,'single'),(46,3,'single'),(47,3,'single'),(48,3,'single'),(49,3,'single'),(50,3,'single'),(51,3,'single'),(52,3,'single'),(53,3,'single'),(54,3,'single'),(55,3,'single'),(56,3,'single'),(57,3,'single'),(58,3,'single'),(59,3,'single'),(60,3,'single'),(61,3,'joint'),(62,3,'joint'),(63,3,'joint'),(64,3,'joint'),(65,3,'joint'),(66,3,'joint'),(67,3,'joint'),(68,3,'joint'),(69,3,'joint'),(70,3,'joint'),(71,3,'joint'),(72,3,'joint'),(73,3,'joint'),(74,3,'joint'),(75,3,'joint'),(76,3,'joint'),(77,3,'joint'),(78,3,'joint'),(79,3,'joint'),(80,3,'joint');

I want to keep:

  • 5 random rows with type = single and account_type_id = 1 or 2
  • 5 random rows with type = joint and account_type_id = 1 or 2
  • 5 random rows with type = single and account_type_id = 3
  • 5 random rows with type = joint and account_type_id = 3

My approach was to get the ids of 5 records matching each of the above, and then delete everything else.

(SELECT id FROM accounts WHERE account_type_id IN (1, 2) AND `type` = 'single' ORDER BY RAND() LIMIT 5)
  UNION
(SELECT id FROM accounts WHERE account_type_id IN (1, 2) AND `type` = 'joint' ORDER BY RAND() LIMIT 5)
  UNION
(SELECT id FROM accounts WHERE account_type_id = 3 AND `type` = 'single' ORDER BY RAND() LIMIT 5)
  UNION
(SELECT id FROM accounts WHERE account_type_id = 3 AND `type` = 'joint' ORDER BY RAND() LIMIT 5)

This correctly returns 5 ids of each required type. However, if I try to use that result set directly in a WHERE id NOT IN (...), I get an error (I've replaced DELETE with SELECT for the example):

SELECT * FROM accounts WHERE id NOT IN(
    (SELECT id FROM accounts WHERE account_type_id IN (1, 2) AND `type` = 'single' ORDER BY RAND() LIMIT 5)
      UNION
    (SELECT id FROM accounts WHERE account_type_id IN (1, 2) AND `type` = 'joint' ORDER BY RAND() LIMIT 5)
      UNION
    (SELECT id FROM accounts WHERE account_type_id = 3 AND `type` = 'single' ORDER BY RAND() LIMIT 5)
      UNION
    (SELECT id FROM accounts WHERE account_type_id = 3 AND `type` = 'joint' ORDER BY RAND() LIMIT 5)
);

Error Code: 1064. You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'UNION   (SELECT id FROM accounts WHERE account_type_id IN (1, 2) AND `type` = 'j' at line 3

If I then add an intermediary subquery as follows:

SELECT * FROM accounts WHERE id NOT IN(
    SELECT a.id FROM (
        (SELECT id FROM accounts WHERE account_type_id IN (1, 2) AND `type` = 'single' ORDER BY RAND() LIMIT 5)
          UNION
        (SELECT id FROM accounts WHERE account_type_id IN (1, 2) AND `type` = 'joint' ORDER BY RAND() LIMIT 5)
          UNION
        (SELECT id FROM accounts WHERE account_type_id = 3 AND `type` = 'single' ORDER BY RAND() LIMIT 5)
          UNION
        (SELECT id FROM accounts WHERE account_type_id = 3 AND `type` = 'joint' ORDER BY RAND() LIMIT 5)
    ) a
);

I get the result I want... please could someone explain why the extra query is necessary?

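NOT IN expects a set of values to compare against, for example:

id NOT IN (1, 2, 3, 4, 5, ...)

In your first query, what follows NOT IN( is a group of parenthesized SELECTs joined by UNION, not a list of values or a single SELECT, so the parser does not see a set of values there and reports a syntax error at the first UNION.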
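For comparison, here is a small sketch of the two shapes that NOT IN accepts without complaint against the accounts table from the question. The specific ids and the account_type_id = 3 filter are only illustrative choices, not part of the original problem.

```sql
-- Shape 1: a literal list of values.
SELECT * FROM accounts WHERE id NOT IN (1, 2, 3, 4, 5);

-- Shape 2: a single SELECT that returns one column.
SELECT * FROM accounts
WHERE id NOT IN (SELECT id FROM accounts WHERE account_type_id = 3);
```

The failing query in the question matches neither shape, because the first thing inside the parentheses is another opening parenthesis followed by UNION.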

If you add the extra subquery, SELECT a.id FROM (...) a is a single SELECT, and its result is already a plain set of ids.

NOT IN (those ids) then returns the right result.

I hope that makes clear what I mean.
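Since the question mentions that SELECT stands in for DELETE, here is a rough sketch of how the working form might look as the actual DELETE. It assumes the accounts table and sample data from the question; the alias keep_ids and the check query are my own additions. As far as I know, the derived table built from the UNION is materialized before the delete runs, which is also what lets the statement read from the same table it deletes from, but please try it on a copy of the data first.

```sql
-- Run inside a transaction so the result can be inspected before it is kept.
START TRANSACTION;

DELETE FROM accounts
WHERE id NOT IN (
    -- Single outer SELECT over a derived table (keep_ids is a hypothetical alias).
    SELECT keep_ids.id FROM (
        (SELECT id FROM accounts WHERE account_type_id IN (1, 2) AND `type` = 'single' ORDER BY RAND() LIMIT 5)
          UNION
        (SELECT id FROM accounts WHERE account_type_id IN (1, 2) AND `type` = 'joint' ORDER BY RAND() LIMIT 5)
          UNION
        (SELECT id FROM accounts WHERE account_type_id = 3 AND `type` = 'single' ORDER BY RAND() LIMIT 5)
          UNION
        (SELECT id FROM accounts WHERE account_type_id = 3 AND `type` = 'joint' ORDER BY RAND() LIMIT 5)
    ) AS keep_ids
);

-- Check what is left: one row per group (type 1 or 2 vs. type 3, single vs. joint).
SELECT (account_type_id = 3) AS is_type_3, `type`, COUNT(*) AS kept
FROM accounts
GROUP BY is_type_3, `type`;

-- COMMIT; if the counts look right, otherwise ROLLBACK;
```

With the sample data, the four groups do not overlap and each has at least 5 rows, so each count from the check query should be 5.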


 