How to prevent Python from escaping special characters when reading a regex from a text file?

I am reading a text file in Python that, among other things, contains pre-written regexes that will be used for matching later on. The text file is of the following format:

...

--> Task 2

Concatenate and print the strings "Hello, " and "world!" to the screen.

--> Answer

Hello, world!

print(\\"Hello,\\s\\"\\s*+\\s*\\"world!\\")

--> Hint 1

You can concatenate two strings with the + operator

...

User input is accepted for each task and is either executed in a subprocess to check its return value or matched against a regex. The issue, though, is that Python's file.readline() escapes all special characters in the regex string (i.e. backslashes), giving me something that isn't useful.

I tried to read in the file as bytes and decode the lines using the 'raw_unicode_escape' argument (described as producing "a string that is suitable as raw Unicode literal in Python source code"), but no dice:

file = open(filename, 'rb')
for line in file:
  line = line.decode('raw_unicode_escape')
  ...

Am I going about this the completely wrong way?

Thanks for any and all help.

P.S. I also found this question: "Issue while reading special characters from file". However, I still have the same trouble when I use open(filename, 'r', encoding='utf-8').

Python regex patterns are just plain strings, so there should be no problem with storing them in a file. Perhaps you are seeing escaped characters after file.readline() because you are looking at the repr of the line, which displays each backslash doubled? The string itself still contains single backslashes, so this is not an issue when you actually use the pattern as a regex:

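import re

filename = '/tmp/test.txt'
with open(filename, 'w') as f:
    # A raw string literal writes the backslashes to the file unchanged.
    f.write(r'\"Hello,\s\"\s*\+\s*\"world!\"')

with open(filename, 'r') as f:
    # readline() keeps a trailing newline if the line has one; strip it
    # so it does not become part of the pattern.
    pat = f.readline().rstrip('\n')
    print(pat)
    # \"Hello,\s\"\s*\+\s*\"world!\"
    print(repr(pat))
    # '\\"Hello,\\s\\"\\s*\\+\\s*\\"world!\\"'
    # The assert passes silently, which means a match was found.
    assert re.search(pat, '  "Hello, " +   "world!"')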
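
For a file that holds several patterns, the same idea can be sketched as follows. This is only an illustration under stated assumptions: the helper name load_patterns, the one-pattern-per-line layout, and the second sample pattern are hypothetical and do not come from the question's file format. It counts the backslashes in the loaded string to show that each one is stored once, even though repr() displays it doubled.

```python
import os
import re
import tempfile


def load_patterns(path):
    """Return one compiled regex per non-empty line of the file at `path`.

    Hypothetical helper: assumes one pattern per line, UTF-8 encoded.
    """
    patterns = []
    with open(path, 'r', encoding='utf-8') as f:
        for line in f:
            # Iterating over a file keeps the trailing newline; drop it so
            # it does not become part of the pattern.
            text = line.rstrip('\n')
            if text:
                patterns.append(re.compile(text))
    return patterns


# Write two sample patterns to a temporary file, one per line.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False,
                                 encoding='utf-8') as tmp:
    tmp.write(r'\"Hello,\s\"\s*\+\s*\"world!\"' + '\n')
    tmp.write(r'\d+\s*\+\s*\d+' + '\n')
    path = tmp.name

try:
    patterns = load_patterns(path)
finally:
    os.remove(path)

first = patterns[0].pattern
# Each backslash is a single character in the string read from the file.
print(len(first), first.count('\\'))
print(bool(patterns[0].search('print("Hello, " + "world!")')))
print(bool(patterns[1].search('1 + 2')))
```

Note that the pattern in the question has an unescaped + after \s*. To match a literal plus sign it needs to be written \+, as in the pattern written to the file above.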
