简体   繁体   中英

Python Pandas MySQL - Why is SQLite so much faster when writing dataframes to a database

I'm developing a website where users import csv files directly to a database and a front end that performs some data analytics on the data once it has been filed in the database. I'm using pandas to convert the csv to a dataframe and to subsequently import that dataframe into the MySQL database:

Import to MySQL database:

engine = create_engine('mysql+mysqlconnector://[username]:[password]@[host]:[port]/[schema]', echo=False)
df = pd.read_csv('C:/Users/[user]/Documents/Sales_Records.csv')
df.to_sql(con= engine, name='data', if_exists='replace')

The problem with this is that for the datasets I work with (5 million rows), the performance is too slow and the action times out without importing the data. However, if I try the same thing except using SQLite3:

import to SQLite3 database:

conn = sqlite3.connect('customer.db')
df = pd.read_csv('C:/Users/[user]/Documents/Sales_Records.csv')
df.to_sql('Sales', conn, if_exists='append', index=False)
mycursor = conn.cursor()
query = 'SELECT * FROM Sales LIMIT 10'
print(mycursor.execute(query).fetchall())

This block of code executes in seconds and imports all 5 million rows of the dataset. So what should I do? I do not anticipate multiple people passing in large datasets all at the same time so I suppose it would not hurt to just ditch MySQL for the clear performance advantages provided by SQLite in this application. It just feels like there's a better way though...

MySQL sends the data to a disk over a network connection.

SQLite3 send the data over a disk directly.

Look at https://gist.github.com/jboner/2841832

You did not mention where the MySQL server is. But even if it was on your local machine, it will pass through a TCP/IP stack whereas SQLite will just write directly to disk.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM