I am looping over the rows of a CSV file and hit this error partway through: 'utf-8' codec can't decode byte 0xd5 in position 2912: invalid continuation byte
I'm only trying to get the row count for the file, using this function:
import csv

def count_lines(filename):
    row_stored = ""
    try:
        with open(filename) as csvfile:
            data_reader = csv.reader(csvfile)
            next(data_reader)  # skip the header row
            count = 0
            for index, row in enumerate(data_reader):
                if index == 1220119:
                    print(row)
                row_stored = row  # remember the last row that was read successfully
                count += 1
            return count
    except Exception as e:
        print(f'There was a problem with your request: {e}\n', row_stored)
        return False
The row above the erroring row looks like this:
['817949019495', 'QMMZN1300568', '4/28/2017', 'Digital Revenue', 'Track', 'Download Europe', 'GB', 'Amazon International - UK', '', '2', '1.2126506333579932', '109926407', '2/28/2017']
And the row that throws the error looks like this:
['817949019495', 'QMMZN1300568', '4/28/2017', 'Digital Revenue', 'Track', 'Download Europe', 'GB', 'Amazon International - UK', '', '2', '1.2126506333579932', '109926407', '2/28/2017']
I don't see any difference between the two. Is there something about the formatting of this particular row that I'm not seeing?
Note: This CSV file is 3.17 GB. I don't know whether that is a contributing factor.
Changing the encoding solved the problem:
with open(filename, encoding="ISO-8859-1") as csvfile:
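Some background on why this works, plus a sketch for finding the offending line. In UTF-8, byte 0xd5 starts a two-byte sequence, so the decoder raises "invalid continuation byte" when the byte that follows is not a valid continuation. ISO-8859-1 maps every possible byte value to a character, so it never raises, but that alone does not prove it is the file's real encoding. Also, Python decodes text files in chunks, so the error is probably raised while decoding data ahead of the last row the reader returned, which would explain why the printed rows look fine. The helper names below (find_undecodable_lines, count_rows) are made up for illustration and are not part of the csv module. This is a minimal sketch that assumes the file uses plain "\n" or "\r\n" line endings:

```python
import csv


def find_undecodable_lines(filename, encoding="utf-8", limit=5):
    """Return (line_number, offset_in_line, raw_bytes) for lines that fail to decode.

    Hypothetical helper. Line numbers are physical lines in the file, which
    can differ from CSV row numbers if quoted fields contain newlines.
    """
    problems = []
    with open(filename, "rb") as raw:  # binary mode: nothing is decoded on read
        for line_number, line in enumerate(raw, start=1):
            try:
                line.decode(encoding)
            except UnicodeDecodeError as exc:
                # exc.start is the offset of the bad byte within this line
                problems.append((line_number, exc.start, line))
                if len(problems) >= limit:
                    break
    return problems


def count_rows(filename, encoding="ISO-8859-1"):
    """Count data rows (header excluded) using an explicit encoding.

    Hypothetical helper; the default encoding is the one from the answer above.
    """
    # newline="" is what the csv module documentation recommends for csv.reader
    with open(filename, encoding=encoding, newline="") as csvfile:
        reader = csv.reader(csvfile)
        next(reader, None)  # skip the header; the default avoids StopIteration on an empty file
        return sum(1 for _ in reader)
```

Calling find_undecodable_lines(filename) first shows the raw bytes around the problem, which helps decide whether ISO-8859-1 (or another encoding such as cp1252) actually renders that text correctly. Both functions read the file as a stream, so the 3.17 GB size should not matter for memory use.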