Why is MySQL 'insert into … select …' so much slower than a select alone?
I'm trying to store a query result in a temporary table for further processing.
create temporary table tmpTest
(
a FLOAT,
b FLOAT,
c FLOAT
)
engine = memory;
insert into tmpTest
(
select a,b,c from someTable
where ...
);
But for some reason the insert takes up to a minute, whereas the subselect alone takes just a few seconds. Why would it take so much longer to write the data to a temporary table than to print it to my SQL management tool's output?
UPDATE: My setup is a MySQL 7.3.2 Cluster with 8 Debian Linux ndb data nodes and 1 SQL node (Windows Server 2012).
The table I'm running the SELECT on is an ndb table.
I tried to find out whether the execution plan differs when using 'insert into …', but the plans look the same (sorry for the formatting, Stack Overflow doesn't have tables):
id  select_type   table        type   possible_keys  key      key_len  ref                rows       Extra
1   PRIMARY       <subquery3>  ALL    \N             \N       \N       \N                 \N         \N
1   PRIMARY       foo          ref    PRIMARY        PRIMARY  3        <subquery3>.fooId  9747434    Using where
2   SUBQUERY      someTable    range  PRIMARY        PRIMARY  3        \N                 136933000  Using where with pushed condition; Using MRR; Using temporary; Using filesort
3   MATERIALIZED  tmpBar       ALL    \N             \N       \N       \N                 1000       \N
CREATE TABLE ... SELECT is slow, too: 47 seconds vs. 5 seconds without the table insert/create.
I wrote a comment above, then stumbled across this as a workaround. This will accomplish what you want to do.
SELECT * FROM aTable INTO OUTFILE '/tmp/atable.txt';
LOAD DATA INFILE '/tmp/atable.txt' INTO TABLE anotherTable;
Note that doing this means managing the /tmp files in some way. If you try to SELECT data into an OUTFILE that already exists, you get an error. So you need to generate unique temporary file names, and then run a cron job of some sort to clean them up.
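The bookkeeping could be sketched like this (the file path, table names, and one-day retention window are placeholders, not from the original answer):

```shell
# Build a unique dump path from timestamp + PID so that
# SELECT ... INTO OUTFILE never collides with an existing file.
OUTFILE="/tmp/atable_$(date +%s)_$$.txt"

# The SQL that would be fed to the mysql client (echoed here, not executed):
SQL="SELECT * FROM aTable INTO OUTFILE '${OUTFILE}';
LOAD DATA INFILE '${OUTFILE}' INTO TABLE anotherTable;"
echo "$SQL"

# A cron entry could then sweep dumps older than a day, e.g.:
# 0 * * * *  find /tmp -name 'atable_*.txt' -mmin +1440 -delete
```

Note that the OUTFILE is written by the MySQL server process, so the path must be writable by the server, not just by your client.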
I guess INFILE and OUTFILE behave differently. If someone can shed some light on what is going on here to explain MySQL's behavior, I would appreciate it.
Here is a better way than using INFILE / OUTFILE.
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
INSERT INTO aTable SELECT ... FROM ...
Here is a relevant post to read: How to improve INSERT INTO ... SELECT locking behavior
I experienced the same issue and was playing around with subqueries, which actually solved it. If the SELECT returns a huge number of rows, inserting the data takes very long. Example:
INSERT INTO b2b_customers (b2b_name, b2b_address, b2b_language)
SELECT customer_name, customer_address, customer_language
FROM customers
WHERE customer_name LIKE "%john%"
ORDER BY customer_created_date DESC
LIMIT 1
Using LIMIT in combination with an INSERT is not a good option. So you can either use two separate queries, one to fetch the data and one to insert it, or you can use a subquery. Example:
INSERT INTO b2b_customers (b2b_name, b2b_address, b2b_language)
SELECT * FROM (
SELECT customer_name, customer_address, customer_language
FROM customers
WHERE customer_name LIKE "%john%"
ORDER BY customer_created_date DESC
LIMIT 1
) sub1
That would be a fast solution without changing your script. I'm not sure why it takes 0.01 seconds to run the subquery but 60 seconds to run the plain insert; I get 1000+ results without the LIMIT. In my case the subquery improved performance from 60 seconds to 0.01 seconds.
The reason has to do with how the computer reads and writes, and with the way a temp file works. The SELECT is reading data from an indexed file on the hard drive, whereas the INSERT is using a temporary file and writing to that file. More RAM is required, and doing so is more difficult. As for why it takes a minute, I am not sure exactly, but I think the code might be slightly incorrect, which would contribute.