
Python regex only working for substring matches but not the whole string

I am trying to remove all bracketed and parenthetical text. I am using the regex

re.sub(r'\(.*\) | \[.*\]', '', text)

This works for things like:

import re
text = 'the (quick) brown fox jumps over the [lazy] dog'
print(re.sub(r'\(.*\) | \[.*\]', '', text))

> the brown fox jumps over the dog

text = '(the quick) brown fox jumps over the [lazy] dog'
print(re.sub(r'\(.*\) | \[.*\]', '', text))

> brown fox jumps over the dog

But it fails when the entire string matches the regex:

text = '[the quick brown fox jumps over the lazy dog]'
print(re.sub(r'\(.*\) | \[.*\]', '', text))

> [the quick brown fox jumps over the lazy dog]

> # This should be '' (the empty string) #

Where am I going wrong?

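You have extra spaces in the regex. The spaces before and after | are part of the pattern, so each alternative only matches when a literal space sits next to the bracketed text. Just remove them: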

re.sub(r'\(.*\)|\[.*\]', '', text)

Or make the whitespace optional, which reproduces your existing output:

re.sub(r'\(.*\)\s?|\s?\[.*\]', '', text)
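To see the effect of the stray spaces, the three patterns can be run side by side. This is a minimal sketch; the variable names (with_spaces, without_spaces, optional_space, whole, mixed) are only illustrative and do not come from the question.

```python
import re

# Pattern from the question: the spaces around | are part of the pattern.
with_spaces = r'\(.*\) | \[.*\]'
# Same pattern with the literal spaces removed.
without_spaces = r'\(.*\)|\[.*\]'
# Variant that also consumes one optional neighbouring whitespace character.
optional_space = r'\(.*\)\s?|\s?\[.*\]'

whole = '[the quick brown fox jumps over the lazy dog]'
mixed = 'the (quick) brown fox jumps over the [lazy] dog'

# repr() makes empty strings and doubled spaces visible.
print(repr(re.sub(with_spaces, '', whole)))     # unchanged: no space precedes '['
print(repr(re.sub(without_spaces, '', whole)))  # ''
print(repr(re.sub(optional_space, '', whole)))  # ''
print(repr(re.sub(without_spaces, '', mixed)))  # 'the  brown fox jumps over the  dog'
print(repr(re.sub(optional_space, '', mixed)))  # 'the brown fox jumps over the dog'
```

The version without spaces leaves doubled spaces where a bracketed word was removed from the middle of a sentence, which is why the optional-whitespace variant is closer to the output shown in the question.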

You have an extra space that it is trying to match :)

Try:

re.sub(r'\(.*\)|\[.*\]', '', text)

A good place to test a regex that behaves unexpectedly like this is here. It's a nice interactive way to see what's going wrong. For example, in your case it didn't match "(pace)" but matched "(pace) " as soon as I put a space after it.
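The same check can be done offline with re.search. This is a rough sketch, assuming the question's pattern; first_match is a hypothetical helper written for this illustration.

```python
import re

# The question's pattern, with the literal spaces around | left in.
pattern = r'\(.*\) | \[.*\]'

def first_match(candidate):
    """Return the matched text, or None if the pattern matches nowhere."""
    found = re.search(pattern, candidate)
    return found.group(0) if found else None

for candidate in ['(pace)', '(pace) ', '[pace]', ' [pace]']:
    # repr() keeps leading and trailing spaces visible.
    print(repr(candidate), '->', repr(first_match(candidate)))
```

Only the candidates with a space after ")" or before "[" produce a match, which points straight at the spaces in the pattern.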

Note:

As I mentioned in the comment, be aware that the greedy match might do unexpected things if your text contains a stray ")" that is just a standalone symbol. Consider reluctant (non-greedy) matching instead:

re.sub(r'\(.*?\)|\[.*?\]', '', text)

which would turn:

"This is a (small) sample text with a ) symbol" ===> "This is a  sample text with a ) symbol"

whereas yours currently would give:

"This is a (small) sample text with a ) symbol" ===> "This is a  symbol"
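The difference between the greedy and reluctant quantifiers can be checked directly on that sample. This is a small sketch; the names greedy and lazy are only labels chosen for the illustration.

```python
import re

text = 'This is a (small) sample text with a ) symbol'

# Greedy: .* runs to the LAST ')' on the line.
greedy = re.sub(r'\(.*\)|\[.*\]', '', text)
# Reluctant: .*? stops at the FIRST ')' after the opening '('.
lazy = re.sub(r'\(.*?\)|\[.*?\]', '', text)

print(repr(greedy))  # 'This is a  symbol'
print(repr(lazy))    # 'This is a  sample text with a ) symbol'
```

Because the reluctant form stops at the first closing bracket, nested brackets such as "a (b (c) d) e" are only partly removed.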

A single character class can also cover both bracket types:

import re
text = '''[the quick brown fox jumps over the lazy dog]
the (quick) brown fox jumps over the [lazy] dog
(the quick) brown fox jumps over the [lazy] dog'''
print(re.sub(r'[(\[].+?[)\]]', '', text))

out:

the  brown fox jumps over the  dog
 brown fox jumps over the  dog
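The output above still has doubled and leading spaces where bracketed text was removed. One possible way to tidy that up is sketched below; strip_bracketed is a hypothetical helper, not something from the answers, and it assumes that collapsing runs of spaces and trimming each line is acceptable for your text.

```python
import re

def strip_bracketed(text):
    """Remove (...) and [...] spans, then tidy the leftover whitespace."""
    # Reluctant match; '.' does not match newlines, so spans stay on one line.
    without = re.sub(r'\(.*?\)|\[.*?\]', '', text)
    # Collapse runs of spaces or tabs left behind by the removal.
    collapsed = re.sub(r'[ \t]{2,}', ' ', without)
    # Trim leading and trailing whitespace on every line.
    return '\n'.join(line.strip() for line in collapsed.split('\n'))

text = '''[the quick brown fox jumps over the lazy dog]
the (quick) brown fox jumps over the [lazy] dog
(the quick) brown fox jumps over the [lazy] dog'''

print(repr(strip_bracketed(text)))
```

This sketch keeps the two bracket types as separate alternatives on purpose: the character-class pattern [(\[].+?[)\]] also accepts mismatched pairs such as "(a]", and because it uses .+? it leaves an empty "()" untouched.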
