I'm developing a website where users import CSV files directly into a database, plus a front end that performs some data analytics on the data once it has been loaded into the database. I'm using pandas to convert the CSV to a DataFrame and then import that DataFrame into the MySQL database:
Import to MySQL database:
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine('mysql+mysqlconnector://[username]:[password]@[host]:[port]/[schema]', echo=False)
df = pd.read_csv('C:/Users/[user]/Documents/Sales_Records.csv')
df.to_sql(con=engine, name='data', if_exists='replace')
The problem is that for the datasets I work with (5 million rows), this is too slow and the operation times out before the data is imported. However, if I try the same thing with SQLite3:
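One thing worth trying before switching databases is batching the insert: `to_sql` accepts a `chunksize` and a `method="multi"` argument, which send rows in bounded batches and pack many rows into each INSERT statement rather than one row per statement. A minimal sketch of that call, using an in-memory SQLite engine and a small synthetic frame purely for illustration (the same `to_sql` call applies unchanged to a `mysql+mysqlconnector` engine URL):

```python
import pandas as pd
from sqlalchemy import create_engine, text

# Small synthetic frame standing in for the 5M-row CSV.
df = pd.DataFrame({"region": ["EU", "US"] * 500, "units": range(1000)})

# In-memory SQLite engine for illustration only; swap in your
# 'mysql+mysqlconnector://...' URL to target MySQL instead.
engine = create_engine("sqlite://", echo=False)

# chunksize bounds how many rows go per round trip; method="multi"
# emits multi-row INSERT statements instead of one row apiece.
df.to_sql("data", con=engine, if_exists="replace", index=False,
          chunksize=200, method="multi")

with engine.connect() as conn:
    n = conn.execute(text("SELECT COUNT(*) FROM data")).scalar()
print(n)
```

Whether this is enough for 5 million rows depends on the server and network, but it usually cuts the per-row round-trip overhead substantially compared with the default one-INSERT-per-row behavior.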
Import to SQLite3 database:
import sqlite3
import pandas as pd

conn = sqlite3.connect('customer.db')
df = pd.read_csv('C:/Users/[user]/Documents/Sales_Records.csv')
df.to_sql('Sales', conn, if_exists='append', index=False)
mycursor = conn.cursor()
query = 'SELECT * FROM Sales LIMIT 10'
print(mycursor.execute(query).fetchall())
This block of code executes in seconds and imports all 5 million rows of the dataset. So what should I do? I do not anticipate multiple people passing in large datasets all at the same time so I suppose it would not hurt to just ditch MySQL for the clear performance advantages provided by SQLite in this application. It just feels like there's a better way though...
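If memory pressure is part of the problem (a 5M-row `read_csv` materializes the whole frame at once), the CSV can also be streamed in chunks and appended per chunk, so no full copy ever sits in memory. A sketch, using an in-memory SQLite connection and a synthetic in-memory CSV in place of the real file path (the same loop works with a SQLAlchemy engine for MySQL):

```python
import io
import sqlite3
import pandas as pd

# Synthetic CSV contents standing in for Sales_Records.csv.
csv_data = io.StringIO(
    "id,amount\n" + "\n".join(f"{i},{i * 2}" for i in range(1000))
)

conn = sqlite3.connect(":memory:")

# Stream the file in fixed-size chunks; each chunk is appended,
# so the full 5M-row frame never has to exist in memory at once.
for chunk in pd.read_csv(csv_data, chunksize=250):
    chunk.to_sql("Sales", conn, if_exists="append", index=False)

rows = conn.execute("SELECT COUNT(*) FROM Sales").fetchone()[0]
print(rows)
```

Chunked reading combines naturally with the `chunksize`/`method="multi"` options on `to_sql` itself when the target is a remote MySQL server.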
MySQL sends the data to disk over a network connection.
SQLite3 writes the data to disk directly.
Look at https://gist.github.com/jboner/2841832
You did not mention where the MySQL server is. But even if it is on your local machine, writes will pass through a TCP/IP stack, whereas SQLite just writes directly to disk.