I need to migrate data from MySQL to Postgres. It's easy to write a script that connects to both databases, runs a SELECT on the MySQL side and INSERTs on the Postgres side, but it is very slow (I have over 1M rows). It's much faster to write the data to a flat file and then import it.
The MySQL command-line client can dump tables quickly as tab-separated values, but that means executing a program external to my script (either running it as a shell command and saving the output to a file, or reading directly from its stdout). I am trying to download the data using Python instead of the MySQL client.
Does anyone know what steps and calls the MySQL command-line client performs to query a large dataset and stream it to stdout? I thought it might just be that the client is written in C and therefore much faster than Python, but the Python binding for MySQL is itself written in C, so... any ideas?
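For reference, here is a minimal sketch of the flat-file approach from Python. It assumes a DB-API connection (e.g. from `pymysql` or `MySQLdb`; `conn` and the table name are placeholders) and streams rows in batches with `fetchmany` rather than loading all 1M+ rows at once. The TSV serialization itself is a pure helper:

```python
import csv
import io


def rows_to_tsv(rows):
    """Serialize an iterable of row tuples as tab-separated text."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    for row in rows:
        writer.writerow(row)
    return buf.getvalue()


def export_table(conn, table, path, batch_size=10_000):
    # Assumption: conn is a DB-API connection (pymysql, MySQLdb, ...).
    # With pymysql, a server-side cursor (pymysql.cursors.SSCursor)
    # avoids buffering the whole result set in client memory.
    cur = conn.cursor()
    cur.execute(f"SELECT * FROM {table}")  # table name is a placeholder
    with open(path, "w", newline="") as f:
        while True:
            rows = cur.fetchmany(batch_size)
            if not rows:
                break
            f.write(rows_to_tsv(rows))
```

The resulting file can then be loaded on the Postgres side with `COPY ... FROM`, which is far faster than row-by-row INSERTs.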
I believe the problem is that you are inserting each row in a separate transaction (the default behavior when you run SQL queries without explicitly starting a transaction). In that case the database must flush changes to disk on every INSERT, which can be 100× slower than inserting the data in a single transaction. Try running BEGIN before importing the data and COMMIT after.
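The effect of wrapping the whole import in one transaction can be sketched with the standard-library sqlite3 module as a stand-in (the same idea applies with a Postgres driver such as psycopg2, where `conn.commit()` ends the transaction):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, val TEXT)")

rows = [(i, f"row-{i}") for i in range(1000)]

# One transaction for the whole batch: a single flush at COMMIT
# instead of one per INSERT (the slow autocommit default).
with conn:  # BEGIN ... COMMIT
    conn.executemany("INSERT INTO t VALUES (?, ?)", rows)

count = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
```

Batching the rows through `executemany` inside a single `with conn:` block is what gives the single BEGIN/COMMIT pair; the per-row version would pay the disk flush a thousand times.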