
Slow import of data in mysql using mysqlimport in version 5.7.31 on Ubuntu 18.04

I'm loading genetic data from Ensembl and following the directions here: https://m.ensembl.org/info/docs/webcode/mirror/install/ensembl-data.html

This is the command I am using:

/data/mysql/bin/mysqlimport -u mysqldba --fields-terminated-by='\t' --fields-escaped-by=\\ homo_sapiens_core_100_38 -L allele_synonyms.txt
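For reference, mysqlimport is essentially a command-line wrapper around LOAD DATA INFILE, so the command above should be roughly equivalent to the following statement (-L maps to LOCAL, and the table name allele_synonyms is derived from the file name):

USE homo_sapiens_core_100_38;

-- rough SQL equivalent of the mysqlimport call above
LOAD DATA LOCAL INFILE 'allele_synonyms.txt'
  INTO TABLE allele_synonyms
  FIELDS TERMINATED BY '\t' ESCAPED BY '\\';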

A few of the tables are massive: 40GB, 90GB and ~300GB. I didn't expect this to go fast, but it seems to be going way too slow. I can import 10GB tables in an hour, but this 40GB one is killing me. I even used slice to make small chunks and load them, but at 20GB it seems to get infinitely longer, like days to load the data, and that's when I add just one more 5GB chunk after the first 20GB is loaded.

I have followed other suggestions, like this post: Improve performance of mysql LOAD DATA / mysqlimport?

I have turned off slow query logging in mysql.cnf and tried other suggestions found on Stack Overflow, like increasing the buffers.
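To be concrete, the buffer-related advice boils down to settings along these lines; the values here are only illustrative examples, not necessarily what I ended up using:

-- illustrative settings only; sizes need to fit the machine's 14GB of RAM
SET GLOBAL slow_query_log = 'OFF';                            -- I set the equivalent of this in mysql.cnf
SET GLOBAL innodb_buffer_pool_size = 8 * 1024 * 1024 * 1024;  -- dynamic in MySQL 5.7
SET GLOBAL bulk_insert_buffer_size = 256 * 1024 * 1024;       -- used by MyISAM bulk loads such as LOAD DATA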

I'm using a very large space for my temp files. I'm specifically running mysqlimport on a .txt file. The 40GB file has about 800,000,000 rows, so it's a lot of data. I have tried adding --num-threds=4. The machine has decent specs: 4 cores and 14GB of RAM.

I'm not sure if this is the correct answer, but this is what worked for me. My data is like 99.9% reads, so I was trying to use MyISAM for the massive tables. Simply changing those tables to InnoDB made the import work. I did this for every table over 25GB and it worked.
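Concretely, the change was just an engine conversion per table, along these lines (allele_synonyms is used here only as an example table name):

-- convert a large table to InnoDB; repeat for each table over ~25GB
ALTER TABLE homo_sapiens_core_100_38.allele_synonyms ENGINE = InnoDB;

-- check which engine each table in the schema currently uses
SELECT TABLE_NAME, ENGINE
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'homo_sapiens_core_100_38';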
