
Python regex: Difference in usage of re.sub, re.match & re.search for whitelisting

All three statements below allow only letters, digits, underscores and hyphens to pass through. Is there a difference between using re.sub, re.match and re.search here? I.e., is there a value of str for which the if statements below would take different execution paths?

import re

str = 'some-random-string *&- '

if re.sub(r'[^a-zA-Z0-9_-]', '', str) == str:
    pass  # do stuff

if re.match(r'[a-zA-Z0-9_-]+$', str):
    pass  # do stuff

if re.search(r'^[a-zA-Z0-9_-]+$', str):
    pass  # do stuff

With re.sub you build a new string and then compare it with the original to detect whether anything was removed. That is extra work (a full pass over the input plus a copy) just to answer a yes/no question, so it's not exactly performant.
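To illustrate the point, here is a rough sketch of the re.sub check next to a variant that only looks for the first disallowed character. The second variant is not from the question; it is swapped in purely for comparison, and both function names are made up for this example.

```python
import re

def passes_sub_check(value):
    # Builds a stripped copy of the whole string, then compares it with the input.
    return re.sub(r'[^a-zA-Z0-9_-]', '', value) == value

def has_no_disallowed_char(value):
    # Hypothetical alternative: stops scanning at the first disallowed character.
    return re.search(r'[^a-zA-Z0-9_-]', value) is None

for value in ['some-random-string', 'some-random-string *&- ']:
    print(passes_sub_check(value), has_no_disallowed_char(value))
# True True
# False False
```

Both give the same verdict; the difference is only in how much work is done to get there.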

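Using re.search with ^ to anchor the beginning of the match is the same as using re.match, as long as re.MULTILINE is not set. With re.MULTILINE, ^ also matches right after each newline, while re.match still only matches at the very start of the string.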
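A quick way to convince yourself of this is to run both calls over a few inputs. This is only a sketch; the helper name same_verdict and the sample strings are my own, not from the question. The last two lines show the one case I know of where they diverge, which is when re.MULTILINE is passed.

```python
import re

PATTERN_MATCH = r'[a-zA-Z0-9_-]+$'
PATTERN_SEARCH = r'^[a-zA-Z0-9_-]+$'

def same_verdict(value, flags=0):
    # True when re.match and the ^-anchored re.search agree on this input.
    matched = re.match(PATTERN_MATCH, value, flags)
    searched = re.search(PATTERN_SEARCH, value, flags)
    return bool(matched) == bool(searched)

samples = ['abc-123', 'abc 123', '', 'bad!\ngood']
print(all(same_verdict(v) for v in samples))  # True

# With re.MULTILINE, ^ in re.search can also match at the start of the second line.
print(bool(re.match(PATTERN_MATCH, 'bad!\ngood', re.MULTILINE)))    # False
print(bool(re.search(PATTERN_SEARCH, 'bad!\ngood', re.MULTILINE)))  # True
```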

Using re.match is much more explicit about what you're trying to achieve: the string has to match the pattern, otherwise it's not valid. It can also bail out early once the match fails, instead of rebuilding the whole string.

In short - stick with re.match for your purposes.
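As for whether the checks can ever disagree: as far as I can tell, yes, in two edge cases. The empty string passes the re.sub comparison but not the + patterns, and a string ending in a single newline passes re.match/re.search (because $ also matches just before a trailing newline) but not the re.sub comparison. The sketch below shows both; the function names are invented for the example, and the re.fullmatch variant is an addition of mine for when a trailing newline must be rejected as well.

```python
import re

def check_sub(value):
    return re.sub(r'[^a-zA-Z0-9_-]', '', value) == value

def check_match(value):
    return bool(re.match(r'[a-zA-Z0-9_-]+$', value))

def check_fullmatch(value):
    # Stricter variant (not in the question): the whole string must match.
    return bool(re.fullmatch(r'[a-zA-Z0-9_-]+', value))

for value in ['abc-123', '', 'abc\n']:
    print(repr(value), check_sub(value), check_match(value), check_fullmatch(value))
# 'abc-123' True True True
# '' True False False
# 'abc\n' False True False
```

So if empty input or a trailing newline can reach this code, pick the check whose behaviour you actually want for those cases.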


 