
Python Dataframe to CSV - UnicodeEncodeError: 'charmap' codec can't encode characters

I use the following code to save roughly 6,000 scraped profiles from a DataFrame to a CSV file:

profiles.to_csv(r'C:\Users\alexa\Desktop\profiles.csv', index=False, header=True, encoding="cp1252")

Partway through, the script stops with the error message below. By that point roughly 1,500 profiles have been written to the CSV successfully. Does anyone know how to solve this?

Traceback (most recent call last):
  File "C:\Users\alexa\PycharmProjects\cameo\main.py", line 75, in <module>
    profiles.to_csv (r'C:\Users\alexa\Desktop\cameo_profiles.csv', index = False, header=True, encoding="cp1252")
  File "C:\Users\alexa\PycharmProjects\cameo\venv\lib\site-packages\pandas\core\generic.py", line 3466, in to_csv
    return DataFrameRenderer(formatter).to_csv(
  File "C:\Users\alexa\PycharmProjects\cameo\venv\lib\site-packages\pandas\io\formats\format.py", line 1105, in to_csv
    csv_formatter.save()
  File "C:\Users\alexa\PycharmProjects\cameo\venv\lib\site-packages\pandas\io\formats\csvs.py", line 257, in save
    self._save()
  File "C:\Users\alexa\PycharmProjects\cameo\venv\lib\site-packages\pandas\io\formats\csvs.py", line 262, in _save
    self._save_body()
  File "C:\Users\alexa\PycharmProjects\cameo\venv\lib\site-packages\pandas\io\formats\csvs.py", line 300, in _save_body
    self._save_chunk(start_i, end_i)
  File "C:\Users\alexa\PycharmProjects\cameo\venv\lib\site-packages\pandas\io\formats\csvs.py", line 311, in _save_chunk
    libwriters.write_csv_rows(
  File "pandas\_libs\writers.pyx", line 72, in pandas._libs.writers.write_csv_rows
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.2544.0_x64__qbz5n2kfra8p0\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in position 15-22: character maps to <undefined>

Process finished with exit code 1

The error means that some cells in your DataFrame contain characters that cannot be encoded in the cp1252 charset. If you have a recent version of pandas (>= 1.1), you can use the errors parameter of to_csv. For example, errors='replace' substitutes a replacement character (? when encoding) for each offending character:

profiles.to_csv(r'C:\Users\alexa\Desktop\profiles.csv', index=False,
                header=True, encoding="cp1252", errors='replace')
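
If you want to see which values are causing the problem before deciding to replace them, a small check like the one below may help. It is only a sketch using plain Python strings: the helper name unencodable_chars and the sample values are made up for illustration and are not part of pandas. It also shows what errors='replace' does at the codec level.

```python
def unencodable_chars(text, encoding="cp1252"):
    """Return the characters in `text` that `encoding` cannot represent."""
    bad = []
    for ch in text:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            bad.append(ch)
    return bad


# Hypothetical sample values standing in for scraped profile cells.
samples = ["Zoë", "price: 5 €", "名前", "smile 😀"]

for value in samples:
    # An empty list means the value is safe to write as cp1252.
    print(repr(value), unencodable_chars(value))

# With errors="replace", each unencodable character becomes "?".
replaced = "名前 Zoë".encode("cp1252", errors="replace").decode("cp1252")
print(replaced)
```

On a real DataFrame, the same helper could presumably be applied to a text column with something like profiles['bio'].map(unencodable_chars), where 'bio' is a placeholder column name. Keep in mind that errors='replace' loses the original characters for good.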

Alternatively, you could use UTF-8, which can represent any Unicode character:

profiles.to_csv(r'C:\Users\alexa\Desktop\profiles.csv', index=False,
                header=True, encoding="utf8")
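
To convince yourself that UTF-8 keeps everything intact, here is a minimal round-trip sketch. It uses the standard csv module instead of pandas so that it runs on its own, and the rows and the temporary file name are invented for the example.

```python
import csv
import os
import tempfile

# Hypothetical rows containing characters that cp1252 cannot encode.
rows = [["name", "bio"], ["名前", "smile 😀"], ["Zoë", "price: 5 €"]]

path = os.path.join(tempfile.mkdtemp(), "profiles_utf8.csv")

# Write the rows as UTF-8; no character needs to be replaced or dropped.
with open(path, "w", newline="", encoding="utf-8") as f:
    csv.writer(f).writerows(rows)

# Read the file back with the same encoding and compare.
with open(path, newline="", encoding="utf-8") as f:
    round_tripped = list(csv.reader(f))

print(round_tripped == rows)

# "utf-8-sig" is the same encoding with a byte order mark (BOM) prepended.
bom = "x".encode("utf-8-sig")[:3]
print(bom)
```

If the CSV is going to be opened in Excel on Windows, encoding="utf-8-sig" may be the more practical choice, because the BOM usually helps Excel recognise the file as UTF-8. to_csv accepts that encoding name as well.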
