What character set is "é" from? (Python: Filename with "é", how to use os.path.exists, filecmp.cmp, shutil.move?)
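What character set is é from? In Windows, Notepad saves an ANSI text file containing this character without complaint. Insert a character outside that code page and you get an error. é seems to work fine in an ASCII terminal in PuTTY (are CP437 and IBM437 the same?), whereas such a character does not.
I know that one is Unicode, not ASCII. But what is é? It does not produce the error I get in Notepad with Unicode characters, yet Python threw SyntaxError: Non-ASCII character '\xc3' in file on line, but no encoding declared; until I added the "magic comment" suggested in Python NLTK: SyntaxError: Non-ASCII character '\xc3' in file (Sentiment Analysis -NLP).
With the "magic comment" added, that error went away, but os.path.isfile() says that a filename containing é does not exist. Ironically, the character é appears in the name of Marc-André Lemburg, the author of the PEP that the error message links to.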
EDIT: If I print the file's path, the accented e shows up as ├⌐, but I can copy and paste é into the command prompt.
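The ├⌐ output is consistent with the two UTF-8 bytes of é being displayed by a console that uses code page 437. A minimal Python 3 sketch of that effect follows; the session above is Python 2, so this is an illustration under that assumption rather than a reproduction of the original environment.

```python
import codecs

# Python 3 sketch: the same character under different encodings.
ch = "é"  # U+00E9 LATIN SMALL LETTER E WITH ACUTE

utf8_bytes = ch.encode("utf-8")
print(utf8_bytes)                      # b'\xc3\xa9' (two bytes)

# Reading those two UTF-8 bytes as CP437 gives the two glyphs seen in the console.
mojibake = utf8_bytes.decode("cp437")
assert mojibake == "├⌐"

# é exists in several single-byte code pages, but not in ASCII.
print(ch.encode("cp437"))              # b'\x82'
print(ch.encode("cp1252"))             # b'\xe9'
try:
    ch.encode("ascii")
    in_ascii = True
except UnicodeEncodeError:
    in_ascii = False
print(in_ascii)                        # False

# Python treats IBM437 as an alias of the cp437 codec.
print(codecs.lookup("IBM437").name)    # cp437
```

So é is not an ASCII character; it simply has a single-byte slot in code pages such as CP437 and Windows-1252, which is why an "ANSI" Notepad file can hold it.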
EDIT2: see below
Private > cat scratch.py ### LOL cat scratch :3
# coding=utf-8
file_name = r"Filéname"
file_name = unicode(file_name)
Private > python scratch.py
Traceback (most recent call last):
File "scratch.py", line 3, in <module>
file_name = unicode(file_name)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 3: ordinal not in range(128)
Private >
EDIT3:
Private > PS1="Private > " ; echo code below ; cat scratch.py ; echo ======= ; echo output below ; python scratch.py
code below
# -*- coding: utf-8 -*-
file_name = r"Filéname"
file_name = unicode(file_name, encoding="utf-8")
# I have code here to determine a path depending on the hostname of the
# machine, the folder paths contain no Unicode characters, for my debug
# version of the script, I will hardcode the redacted hostname.
hostname = "One"
if hostname == "One":
    folder = "C:/path/folder_one"
elif hostname == "Two":
    folder = "C:/path/folder_two"
else:
    folder = "C:/path/folder_three"
path = "%s/%s" % (folder, file_name)
path = unicode(path, encoding="utf-8")
print path
=======
output below
Traceback (most recent call last):
File "scratch.py", line 18, in <module>
path = unicode(path, encoding="utf-8")
TypeError: decoding Unicode is not supported
Private >
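A note on this second traceback: in Python 2, formatting a byte string with a unicode argument already yields a unicode object, so the later unicode(path, encoding="utf-8") tries to decode text that has been decoded once. The sketch below shows the analogous mistake in Python 3, where str(text, encoding=...) plays the role of unicode(...); the variable names mirror the script above, and the byte literal is an assumed stand-in for the Python 2 string.

```python
# Python 3 sketch of the same mistake: decoding text that is already decoded.
raw = b"Fil\xc3\xa9name"               # UTF-8 bytes, like a Python 2 str
file_name = raw.decode("utf-8")        # decode once: now it is text
folder = "C:/path/folder_one"
path = "%s/%s" % (folder, file_name)   # formatting text yields text

try:
    str(path, encoding="utf-8")        # second decode, like unicode(path, encoding=...)
    decoded_twice = True
except TypeError as exc:
    decoded_twice = False
    print("TypeError:", exc)

# The path was already correct after the first decode.
print(path == "C:/path/folder_one/Fil\u00e9name")  # True
```

Dropping the second decode is enough; path is already text after the formatting step.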
You need to tell unicode what encoding the string is in. In this case it is utf-8, not ascii, and the file header should be # -*- coding: utf-8 -*- (see Encoding Declarations).
# -*- coding: utf-8 -*-
file_name = r"Filéname"
file_name = unicode(file_name, encoding="utf-8")
Help on class unicode in module __builtin__:

class unicode(basestring)
 |  unicode(object='') -> unicode object
 |  unicode(string[, encoding[, errors]]) -> unicode object
 |
 |  Create a new Unicode object from the given encoded string.
 |  encoding defaults to the current default string encoding.
 |  errors can be 'strict', 'replace' or 'ignore' and defaults to 'strict'.
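The effect of the encoding argument can be sketched in Python 3 with bytes.decode, which is the closest counterpart to Python 2's unicode(string, encoding). The byte literal below is an assumption standing in for the UTF-8 encoded file name from the question.

```python
# Python 3 sketch: the same bytes decoded with the wrong and the right codec.
raw = b"Fil\xc3\xa9name"               # "Filéname" encoded as UTF-8

try:
    raw.decode("ascii")                # like unicode(file_name) with an ASCII default
    ascii_ok = True
except UnicodeDecodeError as exc:
    ascii_ok = False
    bad_position = exc.start           # index of the first undecodable byte
    print(exc)

file_name = raw.decode("utf-8")        # like unicode(file_name, encoding="utf-8")
print(len(raw), len(file_name))        # 9 8
```

The failing position is 3, matching the "position 3" reported in the first traceback of the question.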
As I mentioned in an earlier comment, switch to Python 3. Python 2 on a Windows file system with Unicode characters in file names can be a nightmare.
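As a rough illustration of why Python 3 is easier here, the sketch below runs the three calls from the question title against file names containing é. The file names and the temporary directory are hypothetical, chosen only for the demonstration.

```python
import filecmp
import os
import shutil
import tempfile

# Python 3 sketch: str paths are Unicode, so no manual decoding is needed.
with tempfile.TemporaryDirectory() as folder:
    src = os.path.join(folder, "Filéname.txt")
    copy = os.path.join(folder, "Filéname_copy.txt")
    for p in (src, copy):
        with open(p, "w", encoding="utf-8") as fh:
            fh.write("same content\n")

    exists_before = os.path.exists(src)
    same = filecmp.cmp(src, copy, shallow=False)   # compare file contents

    dest = os.path.join(folder, "moved_Filéname.txt")
    shutil.move(src, dest)
    exists_after = os.path.exists(src)
    moved = os.path.exists(dest)

print(exists_before, same, exists_after, moved)    # True True False True
```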