简体   繁体   中英

UTF-8 Encoding Pandas DataFrame to MySQL

I am using Pandas, MySQL, and SQLAlchemy.

I am currently using pandas.to_sql() to submit my pandas DataFrame -- that contains non-ASCII text -- to my MySQL database. Note that with other DataFrames I've used, my process works perfectly

However, with non-ascii text, the following call raises an error:

df.to_sql(
        name='tableOne',
        con=engine,
        index=False,
        if_exists='replace')

Resulting Error (Last line):

UnicodeEncodeError: 'ascii' codec can't encode character '\xe5' in position 143: ordinal not in range(128)

After reading similar posts, I've found the answer is to specify 'utf8' in the engine, as below:

engine = create_engine(f'mysql+mysqldb://{MYSQL_USER}:{MYSQL_PASSWORD}@{MYSQL_HOST}:{MYSQL_PORT}/{MYSQL_DATABASE}?charset=utf8', encoding="utf8")

I have tried everything I could think of and everything I could find online. I changed my engine to include every comination of 'utf8', 'utf-8', etc. I've tried manually encoding each text column of my DF to UTF-8. None of this works

I've worked off the following other, similar SO posts, which seem to have similar issues and, therefore, similar solutions, yet nothing has worked for me:

Another UnicodeEncodeError when using pandas method to_sql with MySQL

How to handle encoding in Python 2.7 and SQLAlchemy 🏴‍☠️

How do I get SQLAlchemy to correctly insert a unicode ellipsis into a mySQL table?

convert pandas dataframe to utf8

Thanks for any help!

Edit: fixed brainfart: 'Unicode' -> 'ascii'

e5 is not ascii, nor is it UTF-8. With CHARACTER SET latin1 it is å ; does that make sense? So, I suggest you have 2 errors: One place is asking for ascii, one is providing latin1, and one is expecting utf8.

However, as @lenz points out, E5 could also be the first byte of a CJK character. The error message, since it is looking for ascii, only shows the first offending byte. Seeing the next two bytes would confirm this.

I think you are missing use_unicode=True .

See python and Panda notes farther down that page.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM