regular expression

I need a regular expression that matches strings like this:

txt = "orderType not in ('connect', 'Modify', 'random', 'more')"

Other valid data:

txt = "orderType is in ('connect')"
txt = "orderType not in ('connect', 'Modify')"

Any number of items can appear inside the parentheses, each in quotes and separated by commas as above. Nothing else should match, for example:

txt = "orderType not in ('connect', Modify, 'ran=dom', 'more')"

import re
pattern1 = '\w+\s+(?:is|not)\sin\s+\('
pattern2 = '\'\w+\''
pattern3 = '\s?,\s?'+pattern2+'+'
print(re.findall(pattern3, txt))
pattern6 = pattern1+pattern2
pattern5 = pattern1+pattern2+pattern3
pattern4 = (pattern2+ pattern3)  +'|'+ (pattern2 )
pattern = pattern5+ '|' + pattern6
print(re.findall(pattern,txt))

My output is: ["orderType not in ('connect', 'Modify'"]

The expected output is: orderType not in ('connect', 'Modify', 'random', 'more')

Matching the entire line is fine. I also wouldn't mind a check that returns True for every matching line and False for the rest.

You are missing some parentheses. When you concatenate patterns p1 and p2, which match strings s1 and s2 respectively, the resulting pattern p1+p2 does not necessarily match s1+s2, because of operator precedence in the regexp syntax: a quantifier such as + applies only to the element immediately before it (in your pattern3, the closing quote of pattern2), and | has the lowest precedence of all. The following does what you likely wanted (the changes are in 'pattern3' and 'pattern'):

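import re

txt = "orderType not in ('connect', 'Modify', 'random', 'more')"

pattern1 = r'\w+\s+(?:is|not)\sin\s+\('
pattern2 = r"'\w+'"
# Group the separator and the item so that + repeats the whole ", 'item'" unit
pattern3 = r'(?:\s?,\s?' + pattern2 + ')+'
print(re.findall(pattern3, txt))
pattern6 = pattern1 + pattern2
pattern5 = pattern1 + pattern2 + pattern3
# Wrap each alternative so that | separates exactly these two patterns
pattern = '(?:' + pattern5 + ')|(?:' + pattern6 + ')'
print(re.findall(pattern, txt))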

I only added the missing (?:...) groups to the regexp; the patterns are otherwise unchanged. Note that this does not match the closing parenthesis of the input string; append \s*\) to the pattern if you want that.
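
Since the question also accepts a plain True/False check, here is a minimal sketch of one way to get it. It is an illustration rather than a definitive pattern: the names ITEM, CLAUSE and is_valid_clause are made up for this example, and it assumes an item is a run of word characters in single quotes. It uses re.fullmatch so that a partial match of an invalid line is not counted.

```python
import re

# One quoted item such as 'connect'. Assumption: items contain only word
# characters, so 'ran=dom' and the unquoted Modify are both rejected.
ITEM = r"'\w+'"

CLAUSE = re.compile(
    r"\w+\s+(?:is|not)\s+in\s+"        # field name, then "is in" or "not in"
    + r"\(\s*" + ITEM                  # opening parenthesis and the first item
    + r"(?:\s*,\s*" + ITEM + r")*"     # zero or more further ", 'item'" units
    + r"\s*\)"                         # closing parenthesis
)

def is_valid_clause(text):
    """Return True only if the whole string has the expected shape."""
    return CLAUSE.fullmatch(text) is not None

for line in [
    "orderType not in ('connect', 'Modify', 'random', 'more')",
    "orderType is in ('connect')",
    "orderType not in ('connect', Modify, 'ran=dom', 'more')",
]:
    print(is_valid_clause(line), line)
```

With re.findall, the invalid third line would still yield the prefix up to 'connect', which is why the sketch anchors the pattern to the whole string instead.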

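Try a conditional (yes-no) pattern. Note that this one hard-codes the items 'connect' and 'Modify', so the four-item string is reported as not matched: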

import re

texts = [
    "orderType not in ('connect', 'Modify', 'random', 'more')",
    "orderType is in ('connect')",
    "orderType not in ('connect', 'Modify')",
]

reg = re.compile(r"\s*orderType\s+(?:is|(not))\s+in\s+\(\s*'connect'\s*(?(1),\s*'Modify'\s*\)|\))")
for txt in texts:
    m = reg.fullmatch(txt)
    print("matched -->" if m else "not matched -->", txt)

"""
Descriptions:
    (is|(not))      The outer parentheses delimit the alternation '|' (or).
    (?:is|(not))    ?: makes the outer group non-capturing, because its text as a whole is not needed.
    (not)           If it matches, it becomes group 1.
    \(              Escape sequence; it matches a literal '('.
    (?(1),\s*'Modify'\s*\)|\))
                    Yes-no pattern (?(group_number)expr1|expr2): expr1 is required if group 1 matched, otherwise expr2.
"""

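The yes-no construct described above can be easier to see in isolation. The following is a small sketch unrelated to the orderType input; the names BRACKETED and is_balanced are hypothetical. A closing '>' is required only when the optional opening '<' (group 1) was matched.

```python
import re

# Group 1 captures an optional '<'. The conditional (?(1)>) then demands
# a closing '>' only if group 1 took part in the match.
BRACKETED = re.compile(r"(<)?\w+(?(1)>)")

def is_balanced(text):
    """Return True for 'name' or '<name>', but not for '<name' or 'name>'."""
    return BRACKETED.fullmatch(text) is not None

for sample in ["<user>", "user", "<user", "user>"]:
    print(sample, "->", is_balanced(sample))
```

Here the no-branch is left empty, which Python's re module allows; the answer above fills both branches instead.
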