Writing to txt file in UTF-8 - Python

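My Django application takes a document from the user, generates some reports about it, and writes them to a txt file. The interesting part is that everything works fine on my Mac OS. On Windows, however, it fails to read some letters and converts them into symbols like é™ and ä±. Here is my code: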

views.py

def result(request):
    last_uploaded = OriginalDocument.objects.latest('id')
    original = open(str(last_uploaded.document), 'r')
    original_words = original.read().lower().split()
    words_count = len(original_words)
    open_original = open(str(last_uploaded.document), "r")
    read_original = open_original.read()
    characters_count = len(read_original)
    report_fives = open("static/report_documents/" + str(last_uploaded.student_name) + 
    "-" + str(last_uploaded.document_title) + "-5.txt", 'w', encoding="utf-8")
    # Path to the documents with which original doc is comparing
    path = 'static/other_documents/doc*.txt'
    files = glob.glob(path)
    #endregion

    rows, found_count, fives_count, rounded_percentage_five, percentage_for_chart_five, fives_for_report, founded_docs_for_report = search_by_five(last_uploaded, 5, original_words, report_fives, files)


    context = {
        ...
    }

    return render(request, 'result.html', context)

report txt file

['universitetindé™', 'té™hsili', 'alä±ram.', 'mé™n'] was found in static/other_documents\doc1.txt.
...

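The problem here is that you call open() on the file without specifying an encoding. As stated in the Python documentation, the default encoding is platform dependent. That is probably why you see different results on Windows and macOS.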
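To check which default a particular machine would use, a minimal sketch like the following can help. It assumes a CPython version that follows the documented behaviour of falling back to locale.getpreferredencoding(False) when no encoding is passed; the helper name default_text_encoding is made up for illustration, and the printed value depends on the platform and locale.

```python
import locale
import sys


def default_text_encoding() -> str:
    """Return the encoding open() is documented to fall back to in text mode.

    Hypothetical helper for illustration; it only wraps a standard-library call.
    """
    # Passing False avoids changing the process locale as a side effect.
    return locale.getpreferredencoding(False)


print("Default text encoding:", default_text_encoding())
# UTF-8 mode (python -X utf8 or PYTHONUTF8=1) overrides the locale default.
print("UTF-8 mode enabled:", bool(sys.flags.utf8_mode))
```

On many Windows installations this reports a legacy code page such as cp1252, while macOS typically reports UTF-8, which would match the behaviour described in the question.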

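Assuming the file itself is actually encoded in UTF-8, just specify the encoding when reading it (the same applies to the second open() call in the view, which also omits it):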

original = open(str(last_uploaded.document), 'r', encoding="utf-8")
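The effect can be reproduced on any platform with a short sketch. It rests on two assumptions: that the Windows machine defaulted to cp1252, and that the sample sentence below (reconstructed from the garbled report) resembles the original text. The helper read_words is hypothetical and only mirrors the read().lower().split() step from the view.

```python
import os
import tempfile

# Assumed sample text, reconstructed from the garbled words in the report.
SAMPLE = "universitetində təhsili alıram. mən"


def read_words(path: str, encoding: str) -> list:
    """Read a text file with an explicit encoding and split it into lowercase words."""
    # The with-statement closes the file even if reading fails.
    with open(path, "r", encoding=encoding) as f:
        return f.read().lower().split()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "doc.txt")
    # Write the document as UTF-8, as the uploaded file is assumed to be.
    with open(path, "w", encoding="utf-8") as f:
        f.write(SAMPLE)

    # Decoding UTF-8 bytes as cp1252 (assumed Windows default) garbles the text.
    garbled_words = read_words(path, "cp1252")
    # Decoding with the correct encoding gives the same result on every platform.
    correct_words = read_words(path, "utf-8")

print(garbled_words)
print(correct_words)
```

Under these assumptions the first list contains the same garbled words as the report, and the second contains the intended ones, so passing encoding="utf-8" on every read removes the platform dependence.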

