Pandas dataframe to_sql method with python3

Question

I have a pandas DataFrame with many different columns and would like to insert specific columns into a MySQL database.

I understand from the documentation that the way to do so is to use to_sql like so:

import pandas as pd
import mysql.connector
from sqlalchemy import create_engine

engine = create_engine(db)
data.to_sql(name='dbtable', con=engine, if_exists = 'append', index=False)

However, when I run this in Python 3, I get the following ImportError:

Traceback (most recent call last):
  File "myfile.py", line 130, in <module>
    engine = create_engine('mysql+mysqlconnector://user:password@localhost:port/db')
  File "/home/pohzihow/.local/lib/python3.5/site-packages/sqlalchemy/engine/__init__.py", line 387, in create_engine
    return strategy.create(*args, **kwargs)
  File "/home/pohzihow/.local/lib/python3.5/site-packages/sqlalchemy/engine/strategies.py", line 80, in create
    dbapi = dialect_cls.dbapi(**dbapi_args)
  File "/home/pohzihow/.local/lib/python3.5/site-packages/sqlalchemy/dialects/mysql/mysqldb.py", line 110, in dbapi
    return __import__('MySQLdb')
ImportError: No module named 'MySQLdb'

After some research, I found that MySQLdb does not support Python 3, but the pandas documentation says that to_sql requires an SQLAlchemy engine to connect to the database.

Is there a way to write my pandas DataFrame to a MySQL database directly using Python 3?

Answer 1

SQLAlchemy supports several MySQL DBAPIs, as described in the docs. At the time of writing, these are:

MySQL-Python
PyMySQL
MySQL Connector/Python
CyMySQL
OurSQL
Google Cloud SQL
PyODBC
zxjdbc for Jython

Try using one of the other DBAPIs listed instead of MySQLdb (the MySQL-Python entry), which does not support Python 3.
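
In practice, switching drivers comes down to the engine URL: SQLAlchemy reads the DBAPI name from the `mysql+<driver>://` prefix. Here is a minimal sketch, assuming PyMySQL as the replacement driver. The helper name `build_mysql_url`, the default port, and the credentials are made up for illustration. The password is percent-encoded because characters such as `@` would otherwise break URL parsing.

```python
from urllib.parse import quote_plus


def build_mysql_url(user, password, host, database, driver="pymysql", port=3306):
    """Build an SQLAlchemy URL of the form mysql+<driver>://user:pass@host:port/db.

    Hypothetical helper for illustration; `driver` is the DBAPI name that
    SQLAlchemy should import (e.g. "pymysql" or "mysqlconnector").
    """
    # Percent-encode the credentials so characters like "@" or "/" do not
    # confuse the URL parser.
    return "mysql+{driver}://{user}:{password}@{host}:{port}/{database}".format(
        driver=driver,
        user=quote_plus(user),
        password=quote_plus(password),
        host=host,
        port=port,
        database=database,
    )


url = build_mysql_url("user", "p@ss/word", "localhost", "db")
print(url)  # mysql+pymysql://user:p%40ss%2Fword@localhost:3306/db
```

The resulting string can be passed to `create_engine(url)`, provided the chosen driver package is installed.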

Answer 2

Expanding on ƘɌỈSƬƠƑ's answer, using pymysql as the DBAPI:

import pandas as pd
from sqlalchemy import create_engine

# PyMySQL must be installed; SQLAlchemy imports it based on the URL prefix.
engine = create_engine("mysql+pymysql://USER:PASSWORD@HOST/DATABASE")

# data is your existing DataFrame.
data.to_sql(name='dbtable', con=engine, if_exists='append', index=False)
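
The question also asks about inserting only specific columns. One way is to select them before calling to_sql. The sketch below uses a made-up DataFrame and an in-memory SQLite connection as a stand-in, so it runs without a MySQL server. With MySQL, you would pass the SQLAlchemy engine as `con` instead.

```python
import sqlite3

import pandas as pd

# Hypothetical DataFrame with more columns than the target table needs.
data = pd.DataFrame(
    {
        "id": [1, 2, 3],
        "name": ["a", "b", "c"],
        "scratch": [0.1, 0.2, 0.3],  # column we do not want to store
    }
)

# Stand-in connection so the sketch runs without a MySQL server; with MySQL,
# pass the SQLAlchemy engine as `con` instead.
con = sqlite3.connect(":memory:")

# Select the wanted columns with a list (double brackets), then write them.
data[["id", "name"]].to_sql(name="dbtable", con=con, if_exists="append", index=False)

# Read the table back to confirm which columns were stored.
stored = pd.read_sql_query("SELECT * FROM dbtable", con)
print(stored.columns.tolist())  # ['id', 'name']
```

Only the selected columns reach the table. The original DataFrame is left unchanged.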

Answer 3

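Use the PyMySQL module. It is a pure-Python driver that supports Python 3, and you need a MySQL server running on your system. The code goes like this:

import pandas as pd
import pymysql

# Connection parameters are placeholders; substitute your own.
db = pymysql.connect(host="HOST", user="USER", password="PASSWORD", database="DATABASE")
cursor = db.cursor()

part1 = pd.read_csv("your_file_location")
# Select several columns by passing a list (note the double brackets).
part2 = part1[["Column 1 Name", "Column2 Name"]]
# and further code of yours

# Use %s placeholders and pass the values separately so the driver escapes them.
sql = "INSERT INTO your_table_name (your_column_1, your_column_2) VALUES (%s, %s)"
for row in part2.values.tolist():
    cursor.execute(sql, row)
db.commit()

This inserts the data row by row in a loop and commits once at the end, which imports the data into the SQL database.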
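
One detail to watch with row-by-row inserts is missing data: pandas represents it as NaN, while DB-API drivers expect None for SQL NULL. The sketch below is one possible way to prepare the rows, not a definitive implementation. The helper names and the sample DataFrame are made up, and sqlite3 stands in for the MySQL connection so the code runs without a server. Note that sqlite3 uses `?` placeholders where PyMySQL uses `%s`.

```python
import sqlite3

import pandas as pd


def _to_python(value):
    """Convert one cell to a value a DB-API driver can bind."""
    if pd.isna(value):
        return None  # drivers map None to SQL NULL
    # numpy scalars expose .item(); convert them to built-in Python types.
    return value.item() if hasattr(value, "item") else value


def dataframe_to_rows(df, columns):
    """Return the chosen columns as a list of plain tuples (hypothetical helper)."""
    return [
        tuple(_to_python(value) for value in row)
        for row in df[columns].itertuples(index=False, name=None)
    ]


df = pd.DataFrame({"id": [1, 2], "name": ["a", None], "unused": [9.5, 8.5]})
rows = dataframe_to_rows(df, ["id", "name"])

# sqlite3 stands in for a MySQL connection here.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE your_table_name (id INTEGER, name TEXT)")
con.executemany("INSERT INTO your_table_name (id, name) VALUES (?, ?)", rows)
con.commit()
print(con.execute("SELECT id, name FROM your_table_name").fetchall())
```

Passing all rows to `executemany` lets the driver handle the loop, and a single commit at the end keeps the insert in one transaction.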

3 answers

solution1: 3, ACCEPTED, 2017-05-19 10:04:10
solution2: 2, 2018-06-02 14:38:30
solution3: 0, 2017-05-19 06:50:49
