I have a CSV file that contains some data with the columns PERIODE, IAS_brut, IAS_lissé and Incidence_Sentinelles.
I have a problem with the third one, "IAS_lissé", which is misinterpreted by the pd.read_csv() method and returned as "IAS_liss�".
What is that character?
Because it's causing a bug in my Flask application, is there a way to read that column another way without modifying the file?
In [1]: import pandas as pd
In [2]: pd.read_csv("Openhealth_S-Grippal.csv",delimiter=";").columns
Out[2]: Index([u'PERIODE', u'IAS_brut', u'IAS_liss�', u'Incidence_Sentinelles'], dtype='object')
You can change the encoding parameter of read_csv; see the pandas documentation for read_csv. The encodings Python supports are listed under "Standard Encodings" in the documentation of the codecs module.
I believe for your example you can use the utf-8 encoding (assuming that your language is French).
df = pd.read_csv("Openhealth_S-Grippal.csv", delimiter=";", encoding='utf-8')
Here's an example showing some sample output. All I did was make a csv file with one column, using the problem characters.
df = pd.read_csv('sample.csv', encoding='utf-8')
Output:
IAS_lissé
0 1
1 2
2 3
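As for what the � character is: it is U+FFFD, the Unicode replacement character, which a decoder substitutes for bytes it cannot interpret. The sketch below reproduces it with the standard library only. It assumes (this is a guess, not something the question confirms) that the file was saved as Latin-1, where "é" is the single byte 0xE9.

```python
# Assumption: the header was stored in a Latin-1 (ISO-8859-1) encoded file.
raw = "IAS_lissé".encode("latin-1")  # b'IAS_liss\xe9'

# Strict UTF-8 decoding fails: 0xE9 announces a multi-byte sequence that never completes.
try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# With errors="replace" the undecodable byte becomes U+FFFD.
replaced = raw.decode("utf-8", errors="replace")

# Decoding with the codec the bytes were written in recovers the accent.
recovered = raw.decode("latin-1")

print(strict_ok, repr(replaced), recovered)
```

If the real file behaves like this, passing encoding='utf-8' would raise an error rather than fix the header, and a Latin-1 style codec would be the one to try.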
I found the same problem with Spanish and solved it with the "latin1" encoding:
import pandas as pd
pd.read_csv("Openhealth_S-Grippal.csv",delimiter=";", encoding='latin1')
Hope it helps!
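If you don't know in advance which encoding a file uses, one option is to try a few in order. This is only a sketch: the helper name read_csv_with_fallback and the default list of encodings are made up for illustration, not part of pandas.

```python
import pandas as pd


def read_csv_with_fallback(path, encodings=("utf-8-sig", "latin-1"), **kwargs):
    """Hypothetical helper: return the first DataFrame that decodes cleanly."""
    last_error = None
    for enc in encodings:
        try:
            return pd.read_csv(path, encoding=enc, **kwargs)
        except UnicodeDecodeError as exc:
            last_error = exc  # wrong codec for this file, try the next one
    if last_error is None:
        raise ValueError("no encodings were given")
    raise last_error
```

For the file in the question the call would be read_csv_with_fallback("Openhealth_S-Grippal.csv", delimiter=";"). Keep in mind that latin-1 maps every possible byte to a character, so it never raises; it belongs last in the list, and a successful read with it does not prove the text was decoded correctly.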
Using utf-8 didn't work for me. For example, this piece of code:
bla = pd.DataFrame(data = [1, 2])
bla.to_csv('funkyNamé , things.csv')
blabla = pd.read_csv('funkyNamé , things.csv', delimiter=";", encoding='utf-8')
blabla
Ultimately returned: OSError: Initializing from file failed
I know you said you didn't want to modify the file. If you meant the file content vs the filename, I would rename the file to something without an accent, read the csv file under its new name, then reset the filename back to its original name.
import os
import pandas as pd

originalFolder = r'C:\Users\myself'
originalfilepath = os.path.join(originalFolder, 'funkyNamé , things.csv')
tempfilepath = os.path.join(originalFolder, 'tempName.csv')
os.rename(originalfilepath, tempfilepath)  # temporarily give the file an accent-free name
df = pd.read_csv(tempfilepath, encoding='ISO-8859-1')
os.rename(tempfilepath, originalfilepath)  # restore the original name
If you did mean "without modifying the file name", my apologies for not being helpful to you, and I hope this helps someone else.
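A possible alternative that leaves the file name alone is to open the file in Python yourself and give pandas the open handle instead of the path. This is a sketch under the assumption that the OSError comes from how pandas opens a path containing non-ASCII characters; the helper name read_csv_via_handle is hypothetical, and I have not verified it against the exact setup above.

```python
import pandas as pd


def read_csv_via_handle(path, encoding="latin-1", **kwargs):
    """Hypothetical helper: let Python open the file, then pass the handle to pandas."""
    with open(path, encoding=encoding, newline="") as handle:
        return pd.read_csv(handle, **kwargs)
```

With this approach the encoding goes to open() rather than to read_csv, and the remaining keyword arguments (for example delimiter=";") are forwarded unchanged.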
Try using:
import pandas as pd
df = pd.read_csv('file_name.csv', encoding='utf-8-sig')
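For context, utf-8-sig is UTF-8 that also strips a leading byte order mark (BOM), which some Windows tools write at the start of a file. The short sketch below, using made-up header text, shows the difference; it helps when the first column name comes back with an invisible extra character, but it would not repair a file that is not UTF-8 in the first place.

```python
import codecs

# Made-up header bytes, as written by a tool that prepends a UTF-8 BOM.
raw = codecs.BOM_UTF8 + "PERIODE;IAS_brut".encode("utf-8")

with_bom = raw.decode("utf-8")         # BOM survives as U+FEFF before "PERIODE"
without_bom = raw.decode("utf-8-sig")  # BOM is stripped during decoding

print(repr(with_bom[0]), without_bom.split(";")[0])
```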