
Searching for a string containing literal brackets with a Python Regular Expression

Date = re.search('%s(.*)%s' % ("DateCreated:", "] [TotalTime:"), find_all(Text("Exam"))[0].value).group(1)

I am getting the error "unexpected end of regular expression". My guess is that the regex engine is not accepting the "] [" part of the pattern.

Use re.escape() to escape a string such that it can be used as a literal in a regular expression.

Observe:

With contents escaped

>>> re.search(re.escape('] ['), 'foo ] [ bar')
<_sre.SRE_Match object at 0x105a956b0>

Without contents escaped

>>> re.search('] [', 'foo ] [ bar')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/re.py", line 146, in search
    return _compile(pattern, flags).search(string)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/re.py", line 251, in _compile
    raise error, v # invalid expression
sre_constants.error: unexpected end of regular expression
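The traceback above comes from Python 2.7. As a rough sketch of the same experiment on Python 3 (where the exception is exposed as re.error and the message wording differs), the snippet below shows that the unescaped pattern fails to compile because the "[" opens a character class that is never closed, while the escaped pattern matches the literal text. The variable names are only illustrative.

```python
import re

pattern = '] ['

# Unescaped: ']' on its own is treated as a literal, but '[' starts a
# character class that never gets its closing ']', so compilation fails.
try:
    re.compile(pattern)
    compiled_ok = True
except re.error as exc:
    compiled_ok = False
    print("compile failed:", exc)

# Escaped: every special character is backslash-escaped, so the pattern
# matches the three literal characters "] [".
match = re.search(re.escape(pattern), 'foo ] [ bar')
print(match.group(0))
```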

Escaping Only Literal Components

In your immediate case, of course, you want to escape only the two literal strings you're searching between:

re.search('%s(.*)%s' % (re.escape("DateCreated:"),
                        re.escape("] [TotalTime:")),
          "DateCreated: yadda yadda ] [TotalTime: meh")

...by the way, notice how much easier proper indentation makes readability? You might think about doing that yourself in the future, or using an editor (such as emacs) which will do it for you.
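If you need this in more than one place, the same idea can be wrapped in a small helper. This is only a sketch: text_between is a hypothetical name, not something from the re module, and stripping the surrounding whitespace from the captured group is an assumption about what you want back.

```python
import re

def text_between(start, end, text):
    """Return the text between the literal strings start and end, or None."""
    # Escape only the literal delimiters; the capture group stays a real regex.
    pattern = '%s(.*)%s' % (re.escape(start), re.escape(end))
    match = re.search(pattern, text)
    if match is None:
        return None
    # Assumption: the caller does not want the spaces around the value.
    return match.group(1).strip()

sample = "DateCreated: yadda yadda ] [TotalTime: meh"
result = text_between("DateCreated:", "] [TotalTime:", sample)
print(result)  # yadda yadda
```

Returning None when nothing matches avoids the AttributeError you would otherwise get from calling .group(1) on a failed search.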

Special characters should be escaped when used within a regex pattern:

1) in a direct way:

Date = re.search(r'%s(.*)%s' % ("DateCreated:", r"\] \[TotalTime:"), 'DateCreated: 04-01-2017 ] [TotalTime: 2')
print(Date.group(1).strip())  # 04-01-2017

2) Or by using the re.escape() function (which is preferable):

Date = re.search(r'%s(.*)%s' % (re.escape("DateCreated:"), re.escape("] [TotalTime:")), 'DateCreated: 04-01-2017 ] [TotalTime: 2')
print(Date.group(1).strip())  # 04-01-2017

https://docs.python.org/3/library/re.html#re.escape
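As a quick sanity check, the sketch below compares the two approaches on the sample string. It assumes Python 3; the variable names are illustrative. Both patterns capture the same text, and the raw capture still includes the spaces on either side of the date, so call .strip() if you want just the value.

```python
import re

text = 'DateCreated: 04-01-2017 ] [TotalTime: 2'

# 1) Brackets escaped by hand in a raw string.
manual = r'DateCreated:(.*)\] \[TotalTime:'

# 2) Literal delimiters escaped with re.escape().
escaped = '%s(.*)%s' % (re.escape('DateCreated:'), re.escape('] [TotalTime:'))

manual_value = re.search(manual, text).group(1)
escaped_value = re.search(escaped, text).group(1)

print(repr(manual_value))             # ' 04-01-2017 '
print(manual_value == escaped_value)  # True
```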
