I have started to learn Python for data science. I already use R on an almost daily basis. I am stuck at the first step: I am trying to import a CSV file using the pandas read_csv method, and I have a problem with the file's encoding during import.
If I use read.csv from R everything is ok:
df <- read.csv2("some_path/myfile.txt", stringsAsFactors = FALSE, encoding = 'UTF-8')
But if I use similar code in Python:
import pandas as pd
df = pd.read_csv("some_path/myfile.txt", sep = ';', encoding= 'utf8')
it returns an error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc6 in position 13: invalid continuation byte
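The error message can be reproduced without pandas, which may help show what it means. This is only a sketch under an assumption: the sample word "Ćma" and the guess that the file was written in a single-byte code page such as cp1250 are illustrative, not taken from the actual file. In UTF-8, the byte 0xc6 announces a two-byte sequence, so the decoder fails when the next byte is an ordinary ASCII letter.

```python
# Hypothetical sample text; the real file content is unknown.
raw = "Ćma".encode("cp1250")  # b'\xc6ma': 0xc6 is "Ć" in cp1250

# UTF-8 treats 0xc6 as the first byte of a two-byte sequence,
# but the following byte (b'm') is not a continuation byte.
try:
    raw.decode("utf-8")
    utf8_error = None
except UnicodeDecodeError as exc:
    utf8_error = exc

# latin1 maps every byte to some character, so it never fails,
# but it can silently pick the wrong character.
as_latin1 = raw.decode("latin1")
as_cp1250 = raw.decode("cp1250")

print(utf8_error)
print(as_latin1, as_cp1250)
```

This would also be consistent with latin1 importing "successfully" while showing wrong characters: the decode cannot fail, but the byte-to-character mapping differs from the one the file was written with.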
How is it possible that I can import a file with "utf-8" encoding in R, but not in Python?
If I use a different encoding (latin1 or iso-8859-1), the file is imported successfully, but the characters are not decoded correctly.
Even though I don't understand why UTF-8 works in R but not in Python, I found that the cp1250 encoding works correctly.
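If the right encoding is not known in advance, one option is to test a short list of candidates on the raw bytes before calling read_csv. The helper below is a hypothetical sketch, not part of pandas; the name first_working_encoding and the candidate list are assumptions. Note that a successful decode only means the bytes are valid in that encoding, not that the characters are the intended ones, so the result should still be checked by eye.

```python
from typing import Iterable, Optional


def first_working_encoding(
    raw: bytes, candidates: Iterable[str] = ("utf-8", "cp1250")
) -> Optional[str]:
    """Return the first candidate encoding that decodes raw without error.

    Hypothetical helper for illustration. Put strict encodings first:
    latin1 accepts any byte sequence, so it would always "win".
    """
    for name in candidates:
        try:
            raw.decode(name)
        except UnicodeDecodeError:
            continue
        return name
    return None


# Possible usage with the file from the question (path as given there):
# with open("some_path/myfile.txt", "rb") as fh:
#     enc = first_working_encoding(fh.read())
# df = pd.read_csv("some_path/myfile.txt", sep=";", encoding=enc)
```

Reading the whole file as bytes is fine for small files; for large ones, testing only the first few kilobytes may be enough, at the risk of missing a bad byte further in.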
Try the "UTF-16" encoding. It resolved the same error for me.
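Whether UTF-16 is plausible for a given file can often be checked by looking for a byte order mark (BOM) at the start. The function below is a minimal sketch with a hypothetical name, bom_encoding; it only recognises UTF-8 and UTF-16 BOMs and ignores UTF-32. A file without a BOM can still be UTF-16, so a None result is not conclusive.

```python
import codecs
from typing import Optional


def bom_encoding(raw: bytes) -> Optional[str]:
    """Guess an encoding name from a leading byte order mark, if any.

    Hypothetical helper for illustration; UTF-32 BOMs are not handled.
    """
    if raw.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    return None


# Possible usage: read only the first few bytes of the file.
# with open("some_path/myfile.txt", "rb") as fh:
#     print(bom_encoding(fh.read(4)))
```

If this returns "utf-16", passing encoding="utf-16" to read_csv is a reasonable thing to try; if it returns None, a single-byte code page may be the more likely candidate.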