How to print a string containing special characters with a double backslash (such as \\xe7) in Python

Question

I have a string containing special characters (obtained from an HTML web page request):

'Dimarts, 10 Mar\\xe7 2020'

If I print this string, it correctly unescapes the double backslash and prints only one:

Dimarts, 10 Mar\xe7 2020

But what I want is to print the real character, that is, character 231 (0xE7) = ç

Dimarts, 10 Març 2020

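I tried replacing the double backslash with a single one, and even unescaping with the html library, without success. If I set a new variable by hand with the text and then print it, it works: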

import html

text = 'Dimarts, 10 Mar\\xe7 2020'  # the string as received from the page
print('Original: ', repr(text))
print('Direct  : ', text)
print('Option 1: ', text.replace('\\\\', '\\'))
print('Option 2: ', text.replace(r'\\', '\\'))
print('Option 3: ', text.replace(r'\\', chr(92)))
print('Option 4: ', text.replace('\\', chr(92)))
print('Option 5: ', html.unescape(text))
text = 'Dimarts, 10 Mar\xe7 2020'  # set by hand, with a real \xe7 escape
print('Manual:   ', text)

The result is never what I expect:

Original:  'Dimarts, 10 Mar\\xe7 2020'
Direct  :  Dimarts, 10 Mar\xe7 2020
Option 1:  Dimarts, 10 Mar\xe7 2020
Option 2:  Dimarts, 10 Mar\xe7 2020
Option 3:  Dimarts, 10 Mar\xe7 2020
Option 4:  Dimarts, 10 Mar\xe7 2020
Option 5:  Dimarts, 10 Mar\xe7 2020
Manual:    Dimarts, 10 Març 2020

Is there a way to tell Python to handle the special characters correctly?

Answer 1

Not sure if this is what you want, but:

print(chr(231))

will print the character you want.

The following prints it as well:

print(u"\xe7")
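Neither call above converts the string from the question, which holds a literal backslash followed by the three characters xe7 rather than the single character ç. The following is a minimal sketch of one way to convert such text, using Python's built-in unicode_escape codec. This technique is not part of either answer; the helper name unescape_backslashes is hypothetical, and the sketch assumes the input is pure ASCII.

```python
# Sample text as shown in the question: a literal backslash followed by
# "xe7" (four characters), not the single character ç.
escaped = 'Dimarts, 10 Mar\\xe7 2020'

# chr(231) and "\xe7" name the same single character.
assert chr(231) == "\xe7" == "ç"


def unescape_backslashes(s: str) -> str:
    """Interpret literal \\xNN sequences in s as the characters they name."""
    # Assumption: s is pure ASCII; other input raises UnicodeEncodeError.
    return s.encode('ascii').decode('unicode_escape')


fixed = unescape_backslashes(escaped)
print(fixed)  # Dimarts, 10 Març 2020
```

This only repairs text that has already been escaped. If the raw bytes are still available, decoding them with the correct encoding is usually the cleaner fix.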

Answer 2

OK, it turned out I had a problem with the file encoding on Windows. I had to decode the data before processing it. Doing this solved the problem:

import urllib.request

htmlfile = urllib.request.urlopen('http://www.somewebpage.com/')
for line in htmlfile:
    # each line arrives as bytes; decode it to str before processing
    line = line.decode('cp1252')

It is also possible to decode the whole HTML at once:

import urllib.request

htmlfile = urllib.request.urlopen('http://www.somewebpage.com/').read()
htmldecoded = htmlfile.decode('cp1252')

Doing this solved the problem, and I can print the string correctly.
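The effect of the decoding step can be reproduced without a network request. This is a small sketch, assuming the page bytes are cp1252-encoded as in the answer; the sample bytes are made up to match the string in the question. One plausible cause of the original symptom, not confirmed by the poster, is that the bytes were turned into text with str(), which yields their repr with the escapes included.

```python
# Hypothetical sample: bytes as a cp1252-encoded page might deliver them.
raw = b'Dimarts, 10 Mar\xe7 2020'

# Converting bytes with str() produces their repr, escape sequences included.
as_repr = str(raw)
print(as_repr)   # b'Dimarts, 10 Mar\xe7 2020'

# Decoding with the page's encoding produces the real text.
decoded = raw.decode('cp1252')
print(decoded)   # Dimarts, 10 Març 2020
```

The cp1252 codec is taken from the answer; a page served in another encoding would need that encoding's name instead.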

2 solutions

Solution 1
0 2020-04-16 11:02:59

Solution 2
0 Accepted 2020-04-18 11:24:20
