
Error while importing csv in Python using pandas

I have started learning Python for data science; I already use R on an almost daily basis. I am stuck at the first step: I am trying to import a CSV file with the pandas read_csv function, and I have a problem with the file's encoding during import.

If I use read.csv2 from R, everything is OK:

df <- read.csv2("some_path/myfile.txt", stringsAsFactors = FALSE, encoding = 'UTF-8')

but if I use similar code in Python:

import pandas as pd
df = pd.read_csv("some_path/myfile.txt", sep = ';', encoding= 'utf8')

it returns an error:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc6 in position 13: invalid continuation byte

How is it possible that I can import a file with "utf-8" encoding in R, but not in Python?

If I use a different encoding (latin1 or iso-8859-1), the file is imported successfully, but the characters are not decoded correctly.

Even though I don't understand why UTF-8 works in R but not in Python, I found that the cp1250 encoding works correctly.
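As for the "why": one likely explanation is that R's encoding argument only marks the strings it reads as UTF-8 and does not re-encode or validate the input (re-encoding is what fileEncoding does), whereas Python decodes every byte strictly. The sketch below shows how the same byte, 0xc6, behaves under the three encodings mentioned in this thread. The sample line is hypothetical, not taken from the asker's file, and it uses only the standard library.

```python
# Hypothetical sample line; in cp1250 the letter "Ć" is stored as the single byte 0xC6.
raw = "\u0106MA;1\n".encode("cp1250")
assert raw[0] == 0xC6

# utf-8: 0xC6 announces a two-byte sequence, but the next byte (0x4D, "M")
# is not a continuation byte, so strict decoding fails.
try:
    raw.decode("utf-8")
    utf8_error = None
except UnicodeDecodeError as exc:
    utf8_error = exc.reason

# latin1 never fails, because every byte maps to some character,
# but 0xC6 maps to a different letter than the one intended.
as_latin1 = raw.decode("latin1")

# cp1250 maps 0xC6 back to the original letter.
as_cp1250 = raw.decode("cp1250")

print(utf8_error)
print(repr(as_latin1))
print(repr(as_cp1250))
```

If the file really is cp1250, the corresponding pandas call would be pd.read_csv("some_path/myfile.txt", sep=';', encoding='cp1250').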

Use encoding "UTF-16". I used that to resolve my issue with the same error.
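Whether UTF-16 applies depends on the file. A UTF-16 file usually starts with a byte order mark (BOM), so one quick check is to look at the first few bytes before choosing an encoding. The helper below is a minimal sketch under that assumption; guess_bom_encoding is a hypothetical name, not a pandas function, and a file without a BOM simply yields None.

```python
import codecs

def guess_bom_encoding(head: bytes):
    """Return an encoding name if `head` starts with a known BOM, else None."""
    # UTF-32 is checked before UTF-16 because the UTF-32 LE BOM
    # begins with the same two bytes as the UTF-16 LE BOM.
    boms = [
        (codecs.BOM_UTF32_LE, "utf-32"),
        (codecs.BOM_UTF32_BE, "utf-32"),
        (codecs.BOM_UTF8, "utf-8-sig"),
        (codecs.BOM_UTF16_LE, "utf-16"),
        (codecs.BOM_UTF16_BE, "utf-16"),
    ]
    for bom, name in boms:
        if head.startswith(bom):
            return name
    return None

# Hypothetical samples: one UTF-16 encoded line (Python writes a BOM), one cp1250 line.
utf16_sample = "a;b\n".encode("utf-16")
cp1250_sample = "\u0106MA;1\n".encode("cp1250")

print(guess_bom_encoding(utf16_sample[:4]))
print(guess_bom_encoding(cp1250_sample[:4]))
```

For a real file, the first bytes can be read with open(path, "rb").read(4) and passed to the helper; a non-None result can then be given to read_csv as the encoding argument.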

