I have a pandas dataframe with many different columns and will like to insert specific columns into a mySQL database.
I understand from the documentation that the way to do so is to use to_sql like so:
import pandas as pd
import mysql.connector
from sqlalchemy import create_engine
engine = create_engine(db)
data.to_sql(name='dbtable', con=engine, if_exists = 'append', index=False)
However, when I run this in python3, I am getting the following ImportError:
Traceback (most recent call last):
File "myfile.py", line 130, in <module>
engine = create_engine('mysql+mysqlconnector://user:password@localhost:port/db')
File "/home/pohzihow/.local/lib/python3.5/site-packages/sqlalchemy/engine/__init__.py", line 387, in create_engine
return strategy.create(*args, **kwargs)
File "/home/pohzihow/.local/lib/python3.5/site-packages/sqlalchemy/engine/strategies.py", line 80, in create
dbapi = dialect_cls.dbapi(**dbapi_args)
File "/home/pohzihow/.local/lib/python3.5/site-packages/sqlalchemy/dialects/mysql/mysqldb.py", line 110, in dbapi
return __import__('MySQLdb')
ImportError: No module named 'MySQLdb'
Upon doing some research I found out that actually mySQLdb does not support python3 , but the documentation on pandas says that it requires the SQLAlchemy engine to connect to the database.
Is there a way to output my pandas dataframe into a mySQL database directly using python3?
SQLAlchemy has support for different mySQL DBAPIs, as described in the docs . At the time of writing, these are:
Try using one of the APIs listed instead of mySQLdb.
Expanding on ƘɌỈSƬƠƑ's answer using pymysql as the DBAPI
import pandas as pd
import pymysql
from sqlalchemy import create_engine
engine = create_engine("mysql+pymysql://USER:PASSWORD@HOST/DATABASE")
con = engine.connect()
data.to_sql(name='dbtable', con=engine, if_exists = 'append', index=False)
Use PyMySQL module. It's the latest one and you need to MySQL running in the background of your system. The code goes this way,
import pymysql, pandas as pd
part1 = pd.read_csv("your_file_location)
part2 = part1["Column 1 Name", "Column2 Name"]
#and further code of yours#
for i in your_pandas_dataset:
sql = "INSERT INTO your_table_name(your_column_attributes) \
VALUES ('%d', '%s', #and so on#) %\
(i[1], i[2],#and so on#)"
self.cursor.execute(sql)
self.db.commit()
You need to insert data using loops. And the data will be imported into SQL database.
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.