
Why is MySQL command line so fast vs. Python?

I need to migrate data from MySQL to Postgres. It's easy to write a script that connects to MySQL and to Postgres, runs a select on the MySQL side and inserts on the Postgres side, but it is very slow (I have over 1M rows). It's much faster to write the data to a flat file and then import it.
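For reference, a minimal sketch of the flat-file route, assuming MySQLdb and psycopg2; the table name mytable, its columns, and the connection parameters are placeholders:

    # Dump from MySQL to a tab-separated file, then bulk-load it into
    # Postgres with COPY, which is far faster than row-by-row INSERTs.
    # NOTE: str() does not produce Postgres-style \N for NULLs; real data
    # would need proper escaping.
    import MySQLdb
    import psycopg2

    mysql_conn = MySQLdb.connect(host="localhost", user="user", passwd="pw", db="src")
    cur = mysql_conn.cursor()
    cur.execute("SELECT id, name FROM mytable")

    with open("dump.tsv", "w") as f:
        for row in cur:
            f.write("\t".join(str(col) for col in row) + "\n")
    mysql_conn.close()

    pg_conn = psycopg2.connect("dbname=dst user=user")
    pg_cur = pg_conn.cursor()
    with open("dump.tsv") as f:
        pg_cur.copy_expert("COPY mytable (id, name) FROM STDIN WITH (FORMAT text)", f)
    pg_conn.commit()
    pg_conn.close()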

The MySQL command line client can download tables pretty fast and output them as tab-separated values, but that means executing a program external to my script (either by running it as a shell command and saving the output to a file, or by reading directly from its stdout). I am trying to download the data using Python instead of the MySQL client.
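One difference worth noting: the default MySQLdb cursor buffers the entire result set in memory before returning the first row, whereas the mysql client streams rows as the server sends them. A server-side cursor approximates the client's behavior; a minimal sketch, assuming MySQLdb and the same placeholder table:

    # SSCursor fetches rows from the server as they are consumed
    # (mysql_use_result) instead of loading the whole result set up front.
    import MySQLdb
    import MySQLdb.cursors

    conn = MySQLdb.connect(host="localhost", user="user", passwd="pw", db="src",
                           cursorclass=MySQLdb.cursors.SSCursor)
    cur = conn.cursor()
    cur.execute("SELECT id, name FROM mytable")
    for row in cur:
        print("\t".join(str(col) for col in row))
    conn.close()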

Does anyone know what steps and calls the MySQL command line client performs to query a large dataset and output it to stdout? I thought it might just be that the client is written in C and should be much faster than Python, but the Python binding for MySQL is itself written in C, so... any ideas?

I believe the problem is that you are inserting each row in a separate transaction (which is the default behavior when you run SQL queries without explicitly starting a transaction). In that case, the database must write (flush) changes to disk on every INSERT. That can be 100 times slower than inserting the data in a single transaction. Try running BEGIN before importing the data and COMMIT after.
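A minimal sketch of that, assuming psycopg2 on the Postgres side (which opens a transaction implicitly, so everything before commit() is one transaction, equivalent to an explicit BEGIN/COMMIT); table and column names are placeholders:

    import psycopg2

    pg_conn = psycopg2.connect("dbname=dst user=user")
    pg_cur = pg_conn.cursor()

    rows = [(1, "a"), (2, "b")]  # placeholder data; in practice, rows fetched from MySQL
    pg_cur.executemany("INSERT INTO mytable (id, name) VALUES (%s, %s)", rows)

    pg_conn.commit()  # flush to disk once, instead of once per INSERT
    pg_conn.close()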
