Python: encoding problem when reading from files

Question

When I read some of my files like this:

list_of_files = glob.glob('./*.txt') # create the list of files
for file_name in list_of_files:
    FI = open(file_name, 'r', encoding='cp1252')

I get this error:

UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 1260: character maps to <undefined>

When I switch to this:

list_of_files = glob.glob('./*.txt') # create the list of files
for file_name in list_of_files:
    FI = open(file_name, 'r', encoding="utf-8")

I get this error:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x92 in position 1459: invalid start byte

I have read that I should open the file in binary mode, but I don't know how to do that. This is my function:

def readingAndAddToList():
    list_of_files = glob.glob('./*.txt') # create the list of files
    for file_name in list_of_files:
        FI = open(file_name, 'r', encoding="utf-8")
        stext = textProcessing(FI.read())
        # split returns a list of words delimited by sequences of whitespace (including tabs, newlines, etc, like re's \s)
        secondaryWord_list = stext.split()
        word_list.extend(secondaryWord_list) # Add words to main list
        print("Lungimea fisierului ",FI.name," este de", len(secondaryWord_list), "caractere")
        sortingAndNumberOfApparitions(secondaryWord_list)
        FI.close()

Only the beginning of my function matters, because the error occurs in the reading part.

Answer 1

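If you are on Windows, open the file in Notepad and save it with the encoding you need, so that it matches the encoding you pass to open(). On Linux, do the same in a text editor. Hopefully your program will then run.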
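
If re-saving every file by hand is not practical, the same idea can be approximated in code. The sketch below is only one possible approach, not something from the original answer: it opens the file in binary mode (which is what "open it as a binary file" means), then tries a list of candidate encodings in order. The helper name read_text_with_fallback and the candidate list ("utf-8", then "cp1252") are assumptions chosen because those are the two encodings tried in the question. Byte 0x92 is a curly apostrophe in cp1252 but not a valid start byte in UTF-8, and byte 0x9d is undefined in cp1252, which matches the two errors shown above.

```python
def read_text_with_fallback(path, encodings=("utf-8", "cp1252")):
    """Return (text, label) where label names the encoding that was used.

    Hypothetical helper for illustration; the candidate encodings are an
    assumption and should be adjusted to where the files really come from.
    """
    # Binary mode ("rb") returns raw bytes, so reading never raises
    # UnicodeDecodeError; decoding becomes an explicit, separate step.
    with open(path, "rb") as fh:
        raw = fh.read()

    # Try each candidate encoding in order and keep the first that works.
    for enc in encodings:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            continue

    # Last resort: decode with the final candidate and replace every
    # undecodable byte with U+FFFD instead of failing.
    last = encodings[-1]
    return raw.decode(last, errors="replace"), last + " (with replacement)"
```

In the function from the question, FI.read() could then be swapped for read_text_with_fallback(file_name)[0]. Note that a wrong guess can still decode without an error and produce garbled characters, so it is worth checking the returned label for files that look odd.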

Solution 1
0 2019-03-19 14:03:25
