Unicode decode error while iterating through all txt files in folder

I would like to write the expression | SYS into every txt file in the folder. However, I am getting a Unicode decode error. I suspect it might be because of a missing r in the line with open(txt_file, "r") as f:

My code is:

import os
import csv
import glob

cwd = os.getcwd()

directory = cwd

output = cwd

txt_files = os.path.join(directory, '*.txt')

for txt_file in glob.glob(txt_files):
    with open(txt_file, "r") as f:
        a = f.read()
        print(a)
    # Now write the prepended line followed by the old file data
    with open(txt_file, "w") as f:
        f.write("|   SYS" + a)
    # Re-open only after the write handle is closed, to verify the data in the file
    with open(txt_file, "r") as f:
        b = f.read()
        print(b)

And the error is:

Traceback (most recent call last):
  File "C:/Users/xxxxxx/Downloads/TEST2/Searchcombine.py", line 15, in <module>
    a = f.read()
  File "C:\Python\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8f in position 1060662: character maps to <undefined>

You can try setting the encoding parameter when calling open():

with open(txt_file, "r", encoding="utf-8") as f:
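The traceback shows Python falling back to cp1252, in which byte 0x8f has no character assigned, so the file was probably saved in a different encoding. Which one is not stated in the post. As a rough sketch only, the hypothetical helper below (read_text_with_fallback is not part of any library) tries a list of candidate encodings in order and reports which one decoded cleanly. The candidate list is an assumption and should be adjusted to the files at hand.

```python
import os
import tempfile


def read_text_with_fallback(path, encodings=("utf-8", "cp1252")):
    """Hypothetical helper: return (text, encoding) for the first
    candidate encoding that decodes the whole file without error."""
    last_error = None
    for enc in encodings:
        try:
            with open(path, "r", encoding=enc) as f:
                return f.read(), enc
        except UnicodeDecodeError as exc:
            # Remember the failure and try the next candidate.
            last_error = exc
    if last_error is None:
        raise ValueError("no candidate encodings were given")
    # Every candidate failed: re-raise the last decode error.
    raise last_error


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sample = os.path.join(tmp, "sample.txt")
        with open(sample, "wb") as f:
            # 0xe9 on its own is invalid UTF-8 but valid cp1252.
            f.write(b"caf\xe9")
        text, used = read_text_with_fallback(sample)
        print(used, repr(text))
```

A clean decode does not prove the encoding is the right one: single-byte encodings such as latin-1 accept every byte, so putting one early in the list would hide a wrong guess.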

Although this is not the safest approach for most files, I resolved it by telling open() to ignore decoding errors: I changed the line with open(txt_file, "r") as f: to with open(txt_file, errors='ignore') as f:
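The trade-off with errors='ignore' is that undecodable bytes are silently dropped, and because the script writes the text back, they are lost from the file for good. The sketch below uses a hypothetical helper (prepend_text is an illustrative name, and UTF-8 as the default is an assumption) to show the prepend step with an explicit error policy. errors='replace' substitutes U+FFFD for each bad byte, so the damage at least stays visible.

```python
import os
import tempfile


def prepend_text(path, prefix, encoding="utf-8", errors="replace"):
    """Hypothetical helper: prepend `prefix` to the file at `path`.

    errors="ignore" drops undecodable bytes; errors="replace" turns
    each one into U+FFFD. Either way the original bytes are not
    preserved once the file is rewritten.
    """
    with open(path, "r", encoding=encoding, errors=errors) as f:
        old = f.read()
    # The read handle is closed before the file is reopened for writing.
    with open(path, "w", encoding=encoding) as f:
        f.write(prefix + old)
    return old


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sample = os.path.join(tmp, "sample.txt")
        with open(sample, "wb") as f:
            f.write(b"abc\x8fdef")  # 0x8f is not valid UTF-8 here
        prepend_text(sample, "|   SYS", errors="ignore")
        with open(sample, "r", encoding="utf-8") as f:
            print(repr(f.read()))
```

If the files matter, it is safer to keep a backup copy, or to find the real encoding first, before rewriting them in place.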
