How do I stop Python from escaping special characters when reading regexes from a text file?

Question

I am reading a text file in Python that, among other things, contains pre-written regexes that will be used for matching later on. The text file has the following format:

...

--> Task 2

Concatenate and print the strings "Hello, " and "world!" to the screen.

--> Answer

Hello, world!

print(\\"Hello,\\s\\"\\s*+\\s*\\"world!\\")

--> Hint 1

You can concatenate two strings with the + operator

...

User input is accepted for each task and is either executed in a subprocess to check its return value or matched against a regex. The issue, though, is that Python's file.readline() appears to escape all special characters in the regex string (i.e. backslashes), giving me something that isn't useful.

I tried reading the file as bytes and decoding each line with the 'raw_unicode_escape' codec (described as producing "a string that is suitable as raw Unicode literal in Python source code"), but no dice:

with open(filename, 'rb') as file:
    for line in file:
        line = line.decode('raw_unicode_escape')
        ...

Am I going about this the completely wrong way?

Thanks for any and all help.

P.S. I found this question as well: Issue while reading special characters from file. However, I still have the same trouble when I use open(filename, 'r', encoding='utf-8').

Answer 1

Python regex patterns are just plain old strings, so there should be no problem with storing them in a file. Perhaps when you use file.readline() you are seeing escaped characters because you are looking at the repr of the line? That should not be an issue when you actually use the pattern as a regex, however:

import re
filename='/tmp/test.txt'
with open(filename,'w') as f:
    f.write(r'\"Hello,\s\"\s*\+\s*\"world!\"')

with open(filename,'r') as f:
    pat = f.readline()
    print(pat)
    # \"Hello,\s\"\s*\+\s*\"world!\"
    print(repr(pat))
    # '\\"Hello,\\s\\"\\s*\\+\\s*\\"world!\\"'
    assert re.search(pat,'  "Hello, " +   "world!"')  # Shows match was found

Answer 1
4, accepted, 2011-11-05 20:23:23
