Error while importing csv in Python using pandas

Question

I have started learning Python for data science; I already use R almost daily. I am stuck on the first step: importing a CSV file with the pandas read_csv function. I run into an encoding problem during the import.

If I use read.csv2 in R, everything works:

df <- read.csv2("some_path/myfile.txt", stringsAsFactors = FALSE, encoding = 'UTF-8')

but if I use similar code in Python:

import pandas as pd
df = pd.read_csv("some_path/myfile.txt", sep = ';', encoding= 'utf8')

it returns an error:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc6 in position 13: invalid continuation byte

How is it possible that I can import a file with "utf-8" encoding in R, but not in Python?

If I use a different encoding (latin1 or iso-8859-1), the file imports successfully, but some characters are decoded incorrectly.

Answer 1

I still don't understand why UTF-8 works in R but not in Python, but I found that the cp1250 encoding works correctly.
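To illustrate why cp1250 can succeed where utf8 fails and latin1 gives wrong characters, here is a minimal sketch using only the standard library. The sample text "Ćma;1" is a made-up stand-in for the asker's data, chosen because Ć is stored as the single byte 0xc6 in cp1250, the byte named in the error message. A likely reason R does not complain is that, per R's documentation, the encoding argument of read.csv2 only marks strings as UTF-8 and does not re-encode the input, whereas pandas actually decodes the bytes.

```python
# Hypothetical sample line, encoded the way a cp1250 (Windows Central
# European) file would store it. "Ć" becomes the single byte 0xc6.
raw = "Ćma;1\n".encode("cp1250")

def try_decode(data, encoding):
    """Return the decoded text, or None if the bytes are invalid in that encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None

# utf-8 rejects the bytes, latin1 accepts them but maps 0xc6 to the wrong
# character, and cp1250 recovers the original text.
results = {enc: try_decode(raw, enc) for enc in ("utf-8", "latin1", "cp1250")}

for enc, text in results.items():
    print(enc, repr(text))
```

If cp1250 turns out to be the right encoding for the real file, the same name can be passed to pandas: pd.read_csv("some_path/myfile.txt", sep=';', encoding='cp1250').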

Answer 2

Use the "UTF-16" encoding. That resolved the same error for me.
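Whether UTF-16 applies depends on the file. One quick check, sketched below under the assumption that the file was written with a byte order mark (BOM), is to look at its first few bytes. The helper name sniff_bom is hypothetical and not part of pandas; a file without a BOM returns None and needs another approach.

```python
import codecs

def sniff_bom(data):
    """Return an encoding name if the bytes start with a known BOM, else None."""
    # The UTF-32 LE BOM begins with the UTF-16 LE BOM, so check UTF-32 first.
    boms = [
        (codecs.BOM_UTF32_LE, "utf-32"),
        (codecs.BOM_UTF32_BE, "utf-32"),
        (codecs.BOM_UTF8, "utf-8-sig"),
        (codecs.BOM_UTF16_LE, "utf-16"),
        (codecs.BOM_UTF16_BE, "utf-16"),
    ]
    for bom, name in boms:
        if data.startswith(bom):
            return name
    return None

# Made-up samples: Python's "utf-16" codec prepends a BOM when encoding,
# while cp1250 output has no BOM at all.
utf16_sample = "a;b\n".encode("utf-16")
plain_sample = "a;b\n".encode("cp1250")

print(sniff_bom(utf16_sample))
print(sniff_bom(plain_sample))
```

For a real file, the first four bytes read in binary mode, for example open("some_path/myfile.txt", "rb").read(4), are enough input for this check.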

solution1
0 2017-02-11 22:48:02

solution2
-1 2018-02-26 22:37:43
