
Python regex matches string it shouldn't

I'm totally lost as to how this regex matches this string in Python. Could someone make sense of it, please?

import re
regex = "^PHP/5.\\{3|2\\}.\\{1|2|3|4|5|6|7|8|9|0\\}\\{1|2|3|4|5|6|7|8|9|0\\}$"
ua = 'Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)'
re.compile(regex).search(ua)

The regex starts with PHP, while the string does not. Shouldn't that simply disqualify a match from happening?

You need grouping (preferably non-capturing) for your alternation:

PHP/5.\\{(?:3|2)\\}.\\{(?:1|2|3|4|5|6|7|8|9|0)\\}\\{(?:1|2|3|4|5|6|7|8|9|0)\\}$
         ^^    ^       ^^                    ^      ^^                    ^

Otherwise you will be alternating the entire expression:

  • ^PHP/5.\\{3 or
  • 2\\}.\\{1 or
  • 2 or
  • 3 or
  • 4 or
  • 5 match found!

Think PEMDAS and nested conditionals ( if (a && (b || c)) { } ): alternation has the lowest precedence in a regex, so it needs parentheses to limit its scope.
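To see the difference grouping makes, here is a minimal sketch that runs both patterns against the user agent from the question. The names `ungrouped` and `grouped` are only illustrative, raw strings stand in for the doubled backslashes, and the grouped pattern keeps the question's leading `^` anchor.

```python
import re

ua = 'Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)'

# Pattern from the question: every | splits the whole expression.
ungrouped = r"^PHP/5.\{3|2\}.\{1|2|3|4|5|6|7|8|9|0\}\{1|2|3|4|5|6|7|8|9|0\}$"

# Same pattern with each alternation confined to a non-capturing group.
grouped = r"^PHP/5.\{(?:3|2)\}.\{(?:1|2|3|4|5|6|7|8|9|0)\}\{(?:1|2|3|4|5|6|7|8|9|0)\}$"

ungrouped_match = re.search(ungrouped, ua)
grouped_match = re.search(grouped, ua)

# The ungrouped pattern matches the bare alternative "5" in "Mozilla/5.0".
print(ungrouped_match.group(), ungrouped_match.span())  # 5 (8, 9)

# With grouping, ^PHP must match first, so the user agent is rejected.
print(grouped_match)  # None
```

Note that the grouped pattern still treats the braces as literal characters, so it only accepts strings shaped like `PHP/5.{3}.{1}{0}`. Whether that is what the question intended is not stated.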

Your regex fails because | has the lowest precedence, so it splits the whole pattern into separate alternatives. Your string is therefore tested against each of these pieces on its own:

  • ^PHP/5.\\{3

  • 2\\}.\\{1

and so on. Since one of those alternatives is a bare 5 (from 4|5|6), it actually matches the 5 in Mozilla/5.0.
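The split can be made visible with a short sketch. Because this particular pattern contains no groups or character classes, a plain `str.split("|")` happens to give the same top-level alternatives the regex engine sees. That shortcut is an assumption that would not hold for patterns in general, and the names `alternatives` and `hits` are only illustrative.

```python
import re

ua = 'Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)'
pattern = r"^PHP/5.\{3|2\}.\{1|2|3|4|5|6|7|8|9|0\}\{1|2|3|4|5|6|7|8|9|0\}$"

# No groups in this pattern, so splitting on | yields the top-level alternatives.
alternatives = pattern.split("|")

# Try each alternative on its own and record where (if anywhere) it matches.
hits = []
for alt in alternatives:
    m = re.search(alt, ua)
    if m:
        hits.append((alt, m.start()))

for alt, start in hits:
    print(f"alternative {alt!r} matches at index {start}")

# The full pattern reports the leftmost match among those alternatives.
overall = re.search(pattern, ua)
print(overall.group())  # 5
```

Only the bare-digit alternatives find anything in the user agent; the pieces that contain `PHP` or braces never match.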

