UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 7240: character maps to <undefined>

Question

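I am a student working on my master's thesis, and I am using Python for part of it. I am reading a log file in .csv format and writing the extracted data, cleanly formatted, to another .csv file. However, when reading the file, I get this error: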

Traceback (most recent call last):
  File "C:\Users\SGADI\workspace\DAB_Trace\my_code\trace_parcer.py", line 19, in <module>
    for row in reader:
  File "C:\Users\SGADI\Desktop\Python-32bit-3.4.3.2\python-3.4.3\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 7240: character maps to <undefined>

import csv
import re
#import matplotlib
#import matplotlib.pyplot as plt
import datetime
#import pandas
#from dateutil.parser import parse
#def parse_csv_file():
timestamp = datetime.datetime.strptime('00:00:00.000', '%H:%M:%S.%f')
timestamp_list = []
snr_list = []
freq_list = []
rssi_list = []
dab_present_list = []
counter = 0
f =  open("output.txt","w")
with open('test_log_20150325_gps.csv') as csvfile:
    reader = csv.reader(csvfile, delimiter=';') 
    for row in reader:
        #timestamp = datetime.datetime.strptime(row[0], '%M:%S.%f')
        #timestamp.split(" ",1)

        timestamp = row[0]
        timestamp_list.append(timestamp)


        #timestamp = row[0]
        details = row[-1]
        counter += 1
        print (counter)
        #if(counter > 25000):
        #  break
        #timestamp = datetime.datetime.strptime(row[0], '%M:%S.%f')  



        #timestamp_list.append(float(timestamp))

        #search for SNRLevel=\d+
        snr = re.findall('SNRLevel=(\d+)', details)
        if snr == []:
            snr = 0
        else:
            snr = snr[0]
        snr_list.append(int(snr))

        #search for Frequency=09ABC
        freq = re.findall('Frequency=([0-9a-fA-F]+)', details)
        if freq == []:
            freq = 0
        else:
            freq = int(freq[0], 16)
        freq_list.append(int(freq))

        #search for RSSI=\d+
        rssi = re.findall('RSSI=(\d+)', details)
        if rssi == []:
            rssi = 0
        else:
            rssi = rssi[0]
        rssi_list.append(int(rssi))

        #search for DABSignalPresent=\d+
        dab_present = re.findall('DABSignalPresent=(\d+)', details)
        if dab_present== []:
            dab_present = 0
        else:
            dab_present = dab_present[0]
        dab_present_list.append(int(dab_present))

        f.write(str(timestamp) + "\t")
        f.write(str(freq) + "\t")
        f.write(str(snr) + "\t")
        f.write(str(rssi) + "\t")
        f.write(str(dab_present) + "\n")
        print (timestamp, freq, snr, rssi, dab_present)

        #print (index+1)

        #print(timestamp,freq,snr)
        #print (counter)
#print(timestamp_list,freq_list,snr_list,rssi_list)


'''if  snr != []:
           if freq != []:
               timestamp_list.append(timestamp)
               snr_list.append(snr)
               freq_list.append(freq)
f.write(str(timestamp_list) + "\t")
f.write(str(freq_list) + "\t")
f.write(str(snr_list) + "\n")

print(timestamp_list,freq_list,snr_list)'''
f.close()

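I searched for this special character but could not find anything about it. I also searched the internet, where the suggestion was to change the encoding: I tried utf8, latin1 and a few other encodings, but I still get this error. Could you also help me solve this with pandas? I tried pandas as well, but I still get the error. I even removed a line from the log file, but then the error occurs on the next line.

Please help me find a solution. Thank you.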
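
To find the characters the question says it could not locate, one option is to scan the raw bytes rather than the decoded text. The following is a minimal sketch, not code from the original post: `undefined_bytes` and `find_undefined_bytes` are hypothetical helper names, and the sample bytes are made up for illustration. It asks the codec which byte values it rejects, then reports where those bytes occur.

```python
def undefined_bytes(encoding="cp1252"):
    """Return the set of byte values that the given codec cannot decode."""
    bad = set()
    for value in range(256):
        try:
            bytes([value]).decode(encoding)
        except UnicodeDecodeError:
            bad.add(value)
    return bad


def find_undefined_bytes(data, encoding="cp1252"):
    """Return (offset, byte value) pairs for bytes the codec rejects."""
    bad = undefined_bytes(encoding)
    return [(offset, value) for offset, value in enumerate(data) if value in bad]


# Made-up sample standing in for the raw contents of the log file.
sample = b"SNRLevel=12;\x8dRSSI=40;\x8d"
hits = find_undefined_bytes(sample)
for offset, value in hits:
    print("offset", offset, "byte", hex(value))
```

For a real file, the bytes would come from something like `open(path, 'rb').read()`. Note that the position reported in the traceback may refer to the chunk being decoded at that moment rather than to the offset in the whole file, so the two numbers need not match.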

Answer 1

I have solved this problem. We can use this code:

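import codecs

filename = 'test_log_20150325_gps.csv'  # the log file from the question
types_of_encoding = ["utf8", "cp1252"]
for encoding_type in types_of_encoding:
    # errors='replace' substitutes U+FFFD for undecodable bytes instead of raising,
    # so decoding never fails and the body runs once per encoding in the list.
    with codecs.open(filename, encoding=encoding_type, errors='replace') as csvfile:
        # your code
        ...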
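
The same idea can be written with the built-in `open` in Python 3, which accepts `encoding` and `errors` directly. This is a sketch under assumptions rather than the answerer's exact method: `read_rows` is a hypothetical helper, the sample line is invented, and the `;` delimiter is taken from the code in the question.

```python
import csv
import os
import tempfile


def read_rows(path, encoding="cp1252"):
    """Read a ';'-delimited file, replacing undecodable bytes with U+FFFD."""
    with open(path, encoding=encoding, errors="replace", newline="") as csvfile:
        return list(csv.reader(csvfile, delimiter=";"))


# Invented log line containing the byte 0x8d, which cp1252 leaves undefined.
sample = b"00:00:01.000;SNRLevel=12 \x8d RSSI=40\r\n"
with tempfile.NamedTemporaryFile(delete=False, suffix=".csv") as tmp:
    tmp.write(sample)

rows = read_rows(tmp.name)
os.remove(tmp.name)
print(rows)
```

The trade-off is that replaced bytes are lost for good. That is usually acceptable when only ASCII fields such as `SNRLevel=` or `RSSI=` are extracted, but not when the undecodable text itself matters.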

Answer 2

with open('input.tsv', 'rb') as f:
    for ln in f:
        decoded = False
        line = ''
        # Try each encoding in turn and keep the first one that decodes the line.
        for cp in ('cp1252', 'cp850', 'utf-8'):
            try:
                line = ln.decode(cp)
                decoded = True
                break
            except UnicodeDecodeError:
                pass
        if decoded:
            # use 'line'
            ...
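
The order of the candidate encodings matters in a fallback loop like this: the first codec that accepts the bytes wins, even if it produces the wrong characters. A variation, offered as a sketch and not as the answerer's code, is to try the strictest codec first and finish with latin-1, which maps every byte value and therefore never fails. `decode_line` and its default encoding list are assumptions chosen for illustration.

```python
def decode_line(raw, encodings=("utf-8", "cp1252", "latin-1")):
    """Decode raw bytes with the first encoding that accepts them.

    Returns a (text, encoding) pair. latin-1 is last because it accepts
    every byte value, so the loop always ends with a result.
    """
    for encoding in encodings:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    raise ValueError("no encoding in the list could decode the line")


# Invented sample lines: valid UTF-8, cp1252-only, and a byte cp1252 rejects.
results = [decode_line(raw) for raw in (b"caf\xc3\xa9", b"caf\xe9", b"x\x8dy")]
for text, encoding in results:
    print(repr(text), encoding)
```

Decoding line by line means different lines of one file can end up decoded with different encodings. If the file is known to use a single encoding, it is safer to determine that encoding once and use it for the whole file.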

2 answers

Answer 1
5 2015-06-29 10:23:43

Answer 2
1 2016-09-22 07:21:28
