简体   繁体   中英

Pandas dataframe to_sql method with python3

I have a pandas dataframe with many different columns and will like to insert specific columns into a mySQL database.

I understand from the documentation that the way to do so is to use to_sql like so:

import pandas as pd
import mysql.connector
from sqlalchemy import create_engine

engine = create_engine(db)
data.to_sql(name='dbtable', con=engine, if_exists = 'append', index=False)

However, when I run this in python3, I am getting the following ImportError:

Traceback (most recent call last):
  File "myfile.py", line 130, in <module>
    engine = create_engine('mysql+mysqlconnector://user:password@localhost:port/db')
  File "/home/pohzihow/.local/lib/python3.5/site-packages/sqlalchemy/engine/__init__.py", line 387, in create_engine
    return strategy.create(*args, **kwargs)
  File "/home/pohzihow/.local/lib/python3.5/site-packages/sqlalchemy/engine/strategies.py", line 80, in create
    dbapi = dialect_cls.dbapi(**dbapi_args)
  File "/home/pohzihow/.local/lib/python3.5/site-packages/sqlalchemy/dialects/mysql/mysqldb.py", line 110, in dbapi
    return __import__('MySQLdb')
ImportError: No module named 'MySQLdb'

Upon doing some research I found out that actually mySQLdb does not support python3 , but the documentation on pandas says that it requires the SQLAlchemy engine to connect to the database.

Is there a way to output my pandas dataframe into a mySQL database directly using python3?

SQLAlchemy has support for different mySQL DBAPIs, as described in the docs . At the time of writing, these are:

  • MySQL-Python
  • PyMySQL
  • MySQL Connector/Python
  • CyMySQL
  • OurSQL
  • Google Cloud SQL
  • PyODBC
  • zxjdbc for Jython

Try using one of the APIs listed instead of mySQLdb.

Expanding on ƘɌỈSƬƠƑ's answer using pymysql as the DBAPI

import pandas as pd
import pymysql
from sqlalchemy import create_engine

engine = create_engine("mysql+pymysql://USER:PASSWORD@HOST/DATABASE")
con = engine.connect()

data.to_sql(name='dbtable', con=engine, if_exists = 'append', index=False)

Use PyMySQL module. It's the latest one and you need to MySQL running in the background of your system. The code goes this way,

import pymysql, pandas as pd
part1 = pd.read_csv("your_file_location)
part2 = part1["Column 1 Name", "Column2 Name"]
#and further code of yours#
for i in your_pandas_dataset:
    sql = "INSERT INTO your_table_name(your_column_attributes) \
            VALUES ('%d', '%s', #and so on#) %\
            (i[1], i[2],#and so on#)"
            self.cursor.execute(sql)
            self.db.commit()

You need to insert data using loops. And the data will be imported into SQL database.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM