Why is MySQL 'insert into … select …' so much slower than a select alone?
I'm trying to store a query result in a temporary table for further processing.
create temporary table tmpTest
(
a FLOAT,
b FLOAT,
c FLOAT
)
engine = memory;
insert into tmpTest
(
select a,b,c from someTable
where ...
);
But for some reason the insert takes up to a minute, whereas the subselect alone takes just a few seconds. Why would it take so much longer to write the data to a temporary table than to print it to my SQL management tool's output?
UPDATE: My setup is a MySQL 7.3.2 Cluster with 8 Debian Linux ndb data nodes and 1 SQL node (Windows Server 2012).
The table I'm running the SELECT on is an ndb table.
I tried to find out whether the execution plan differs when using 'insert into …', but the plans look the same (sorry for the formatting, Stack Overflow doesn't have tables):
id  select_type   table        type   possible_keys  key      key_len  ref                rows       Extra
1   PRIMARY       <subquery3>  ALL    \N             \N       \N       \N                 \N         \N
1   PRIMARY       foo          ref    PRIMARY        PRIMARY  3        <subquery3>.fooId  9747434    Using where
2   SUBQUERY      someTable    range  PRIMARY        PRIMARY  3        \N                 136933000  Using where with pushed condition; Using MRR; Using temporary; Using filesort
3   MATERIALIZED  tmpBar       ALL    \N             \N       \N       \N                 1000       \N
CREATE TABLE ... SELECT is slow, too: 47 seconds vs. 5 seconds without the table insert/create.
I wrote a comment above, then stumbled across this as a workaround. This will accomplish what you want to do.
SELECT * FROM aTable INTO OUTFILE '/tmp/atable.txt';
LOAD DATA INFILE '/tmp/atable.txt' INTO TABLE anotherTable;
Note that doing this means managing the /tmp files in some way. If you try to SELECT data into an OUTFILE that already exists, you get an error. So you need to generate unique temporary file names, and then run a cron job of some sort to clean them up.
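The bookkeeping could be sketched like this (the file path, table names, and one-day retention window are placeholders, not from the original answer):

```shell
# Build a unique dump path from timestamp + PID so that
# SELECT ... INTO OUTFILE never collides with an existing file.
OUTFILE="/tmp/atable_$(date +%s)_$$.txt"

# The SQL that would be fed to the mysql client (echoed here, not executed):
SQL="SELECT * FROM aTable INTO OUTFILE '${OUTFILE}';
LOAD DATA INFILE '${OUTFILE}' INTO TABLE anotherTable;"
echo "$SQL"

# A cron entry could then sweep dumps older than a day, e.g.:
# 0 * * * *  find /tmp -name 'atable_*.txt' -mmin +1440 -delete
```

Note that the OUTFILE is written by the MySQL server process, so the path must be writable by the server, not just by your client.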
I guess INFILE and OUTFILE behave differently. If someone can shed some light on what is going on here to explain MySQL's behavior, I would appreciate it.
Here is a better way than using INFILE / OUTFILE.
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
INSERT INTO aTable SELECT ... FROM ...
Here is a relevant post to read: How to improve INSERT INTO ... SELECT locking behavior
I experienced the same issue and was playing around with subqueries, which actually solved it. If the SELECT returns a huge number of rows, inserting the data takes very long. Example:
INSERT INTO b2b_customers (b2b_name, b2b_address, b2b_language)
SELECT customer_name, customer_address, customer_language
FROM customers
WHERE customer_name LIKE "%john%"
ORDER BY customer_created_date DESC
LIMIT 1
Using LIMIT in combination with an INSERT is not a good option. So you can either use two separate queries, one to fetch the data and one to insert it, or you can use a subquery. Example:
INSERT INTO b2b_customers (b2b_name, b2b_address, b2b_language)
SELECT * FROM (
SELECT customer_name, customer_address, customer_language
FROM customers
WHERE customer_name LIKE "%john%"
ORDER BY customer_created_date DESC
LIMIT 1
) sub1
That would be a fast solution without changing your script. I'm not sure why it takes 0.01 seconds to run the subquery but 60 seconds to run the plain insert; I get 1000+ results without the LIMIT. In my case the subquery improved performance from 60 seconds to 0.01 seconds.
The reason has to do with how the computer reads and writes, and with the way a temp file works. The SELECT is reading data from an indexed file on the hard drive, whereas the INSERT is using a temporary file and writing to that file. More RAM is required, and doing so is more difficult. As for why it takes a minute, I am not sure exactly, but I think the code might be slightly incorrect, which would contribute.