Why doesn't my Unicode RegEx pattern work?

Question

import re
file = open('C:\item.bh.txt', 'r', encoding = 'utf-16')
pattern = re.findall(ur'[\u09ac][\u0995]', file)

It shows the following error:

  File "<ipython-input-22-bbd94837f9ee>", line 1
    pattern = re.findall(ur'[\u09ac][\u0995]', file)
                                            ^
SyntaxError: invalid syntax

Answer 1

A raw Unicode string doesn't make sense here, because you want the escape sequences to be interpreted. Second, re.findall takes a string, not a file object, so you have to read the file first. The character classes are also unnecessary, because each contains only a single character.

re.findall(u'\u09ac\u0995', file.read())

Or in context:

import re
file = open(r'C:\item.bh.txt', 'r', encoding = 'utf-16')
pattern = re.findall(u'\u09ac\u0995', file.read())
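
The points above can be checked with a small self-contained sketch. It assumes Python 3 (which the encoding argument to open suggests), where the ur'' prefix is not valid syntax at all. The sample text and the temporary file are made up for illustration and stand in for the original C:\item.bh.txt.

```python
import os
import re
import tempfile

# In Python 3 the combined ur'' prefix is rejected by the parser.
try:
    compile("p = ur'x'", "<demo>", "exec")
    ur_prefix_ok = True
except SyntaxError:
    ur_prefix_ok = False

# Hypothetical sample text containing U+09AC U+0995 twice.
sample = "\u09ac\u0995 abc \u09ac\u0995"

# Write it as UTF-16 to a temporary file (a stand-in for the real path).
with tempfile.NamedTemporaryFile(
    "w", encoding="utf-16", suffix=".txt", delete=False
) as tmp:
    tmp.write(sample)
    path = tmp.name

try:
    with open(path, "r", encoding="utf-16") as f:
        text = f.read()  # re.findall needs a str, not a file object
finally:
    os.remove(path)

# Plain literal: Python itself turns \u09ac into the character.
plain = re.findall("\u09ac\u0995", text)
# Raw literal: the re module interprets \uXXXX in str patterns (Python 3.3+).
raw = re.findall(r"\u09ac\u0995", text)

print(ur_prefix_ok, len(plain), plain == raw)
```

Opening the file in a with block also closes it once the text has been read, which the shorter snippets above leave to the interpreter.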

Solution 1
1 Accepted 2019-12-09 06:53:00