How to fix (what I think is) an encoding issue when exporting python dictionary to .csv file using pandas DataFrame?

I'm new to Python, and I'm trying to scrape soccer transfers from a website ( https://www.transfermarkt.co.uk ). I wrote a bunch of code to clean up the scraped data, and now I'm trying to export it to a .csv file using DataFrame. When I export the data from a dictionary, some characters (like tilde ñ) are automatically capitalized and have what seems to be a completely random special character in front of them (like '¡' or '@').

I've imported DataFrame from pandas. I'm using Excel on Windows to open the .csv file. When printed in the Python console, all letters appear normal (not capitalized and without the special character). All my code works; the issue only appears when exporting to the .csv.

df = pd.DataFrame(dict_players)

file_path = dirname + '/' + league + '_' + date + ".csv"

export_csv = df.to_csv (file_path, index = None, header=True)

Here is an example from the .csv file that I copied:

"Michaël"

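This has to do with the encoding being used. The default for to_csv is utf-8, which stores non-ASCII characters such as ñ or ë as more than one byte. UTF-8 itself can represent all of these characters; the problem is that Excel on Windows, when the file has no byte order mark, typically reads it with a legacy single-byte code page, so each multi-byte character shows up as two unrelated characters (ñ becomes 'Ã±', for example). That is also why the output looks fine in the Python console but not in the .csv. Therefore you can try changing your encoding to latin-1, which typically matches what Excel expects for these characters on Western-language Windows setups.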

df.to_csv(file_path, index=False, header=True, encoding='latin-1')
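One caveat with latin-1 is that it only covers Western European characters, so a name containing something like "Ł" would raise a UnicodeEncodeError on export. A possible alternative, offered here as a sketch rather than as part of the original answer, is encoding='utf-8-sig', which writes a byte order mark that Excel generally uses to detect UTF-8. The sample data, the temporary file path, and the variable names below are made up for illustration; only dict_players and the to_csv call come from the question.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data standing in for the scraped dict_players.
dict_players = {"player": ["Michaël", "Iñaki"]}
df = pd.DataFrame(dict_players)

# What goes wrong: the two UTF-8 bytes of "ñ" read as Windows-1252
# turn into two separate characters.
mojibake = "ñ".encode("utf-8").decode("cp1252")
print(mojibake)  # Ã±

# latin-1 cannot encode every character a player name might contain.
try:
    "Łukasz".encode("latin-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False

# Alternative: UTF-8 with a byte order mark (BOM) at the start of the file.
# The path is a throwaway temp file, not the one from the question.
file_path = os.path.join(tempfile.mkdtemp(), "players.csv")
df.to_csv(file_path, index=False, header=True, encoding="utf-8-sig")

# Check that the file starts with the BOM and reads back unchanged.
with open(file_path, "rb") as f:
    has_bom = f.read().startswith(b"\xef\xbb\xbf")
round_trip = pd.read_csv(file_path, encoding="utf-8-sig")
print(has_bom, round_trip["player"].tolist())
```

If the data only ever contains Western European names, latin-1 as shown above is enough; utf-8-sig is just the safer choice when you can't be sure.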
