How to convert a string containing Unicode escapes (\u####) to a UTF-8 string

Question

I have been trying this since morning.

My sample.txt:

choice = \u9078\u629e

Code:

with open('sample.txt', encoding='utf-8') as f:
    for line in f:
        print(line)
        print("選択" in line)
        print(line.encode('utf-8').decode('utf-8'))
        print(line.encode().decode('utf-8'))
        print(line.encode('utf-8').decode())
        print(line.encode().decode('unicode-escape').encode("latin-1").decode('utf-8')) # as suggested.

out:
choice = \u9078\u629e
False
choice = \u9078\u629e
choice = \u9078\u629e
choice = \u9078\u629e
UnicodeEncodeError: 'latin-1' codec can't encode characters in position 9-10: ordinal not in range(256)

When I do this in the IPython qtconsole:

In [29]: "choice = \u9078\u629e"
Out[29]: 'choice = 選択'

So the question is: how can I read a text file containing Unicode-escaped strings like \u9078\u629e (I don't know exactly what they're called) and convert them to UTF-8 text like 選択?

Answer 1

If you read it from a file, just specify the encoding when opening it:

with open('test.txt', encoding='unicode-escape') as f:
    a = f.read()
print(a)

# choice = 選択

with test.txt containing:

choice = \u9078\u629e
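
To try this end to end, here is a minimal sketch that creates the file first and then reads it back. The temporary directory and the `read_escaped` helper name are assumptions made for illustration; they are not part of the original answer.

```python
import tempfile
from pathlib import Path


def read_escaped(path):
    """Read a text file whose content uses \\uXXXX escapes (hypothetical helper)."""
    # The unicode-escape codec turns each \uXXXX sequence into its character.
    with open(path, encoding='unicode-escape') as f:
        return f.read()


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / 'test.txt'
    # Raw string: the file holds a literal backslash followed by u9078, etc.
    path.write_text(r'choice = \u9078\u629e', encoding='ascii')
    text = read_escaped(path)

print(text)
```

The file itself is plain ASCII here, which is the case this codec handles cleanly.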

If you already have the text in a string, you can convert it like this:

a = "choice = \\u9078\\u629e"
a.encode().decode('unicode-escape')
# 'choice = 選択'
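
One caveat worth knowing: the `unicode-escape` codec decodes its input bytes as Latin-1, so if the string already contains real non-ASCII characters next to the escapes, `a.encode().decode('unicode-escape')` will garble them. A possible workaround, sketched below, replaces only the `\uXXXX` sequences with a regular expression. The `unescape_u` name is hypothetical, and the sketch assumes only four-digit `\u` escapes need handling (not `\U########`, `\x##`, surrogate pairs, or escaped backslashes).

```python
import re

# Matches a literal backslash, 'u', and exactly four hex digits.
_U_ESCAPE = re.compile(r'\\u([0-9a-fA-F]{4})')


def unescape_u(s):
    """Replace \\uXXXX escapes in s, leaving every other character untouched."""
    return _U_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), s)


# Escapes mixed with characters that are already real text.
mixed = 'choice = \\u9078\\u629e (選択)'
print(unescape_u(mixed))
# The codec route mis-decodes the characters that were already real:
print(mixed.encode().decode('unicode-escape'))
```

For input that is pure ASCII plus escapes, as in the question, the codec-based one-liner above is simpler and gives the same result.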

Answer 1
2 accepted 2018-03-16 08:18:16
