
Regular expression

I need a regular expression that matches strings like this:

txt = "orderType not in ('connect', 'Modify', 'random', 'more')"

Valid inputs:

txt = "orderType is in ('connect')"
txt = "orderType not in ('connect', 'Modify')"

Any number of items may appear inside the parentheses, each in single quotes and separated by commas, as above. Nothing else should match, for example:

txt = "orderType not in ('connect', Modify, 'ran=dom', 'more')" 
import re
pattern1 = '\w+\s+(?:is|not)\sin\s+\('
pattern2 = '\'\w+\''
pattern3 = '\s?,\s?'+pattern2+'+'
print(re.findall(pattern3, txt))
pattern6 = pattern1+pattern2
pattern5 = pattern1+pattern2+pattern3
pattern4 = (pattern2+ pattern3)  +'|'+ (pattern2 )
pattern = pattern5+ '|' + pattern6
print(re.findall(pattern,txt))

My output is ["orderType not in ('connect', 'Modify'"].

Expected output: orderType not in ('connect', 'Modify', 'random', 'more')

It does not have to return the matched text. I would be fine with a check on the entire line that returns true for every valid line and false for the rest.

You are missing some parentheses. When you concatenate two patterns p1 and p2 that match strings s1 and s2 respectively, the resulting pattern p1+p2 does not necessarily match s1+s2, because of operator precedence in regexp syntax: a quantifier such as + applies only to the single item immediately before it unless you group a larger unit with parentheses. The following does what you likely wanted (the changes are in pattern3 and pattern):

import re
pattern1 = '\w+\s+(?:is|not)\sin\s+\('
pattern2 = '\'\w+\''
pattern3 = '(?:\s?,\s?'+pattern2+')+'
print(re.findall(pattern3, txt))
pattern6 = pattern1+pattern2
pattern5 = pattern1+pattern2+pattern3
pattern = '(?:'+pattern5+ ')|(?:' + pattern6+')'
print(re.findall(pattern,txt))
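
To see why the grouping matters, here is a small sketch that isolates the quantifier issue. The variable names (`item`, `sep`, `tail`) are illustrative and not part of the code above.

```python
import re

item = r"'\w+'"      # one quoted item, e.g. 'connect'
sep = r"\s?,\s?"     # a comma with optional whitespace around it

# Without a group, '+' applies only to the last atom (the closing quote).
ungrouped = sep + item + "+"
# With a non-capturing group, '+' repeats the whole ", 'item'" unit.
grouped = "(?:" + sep + item + ")+"

tail = ", 'Modify', 'random', 'more'"
ungrouped_hits = re.findall(ungrouped, tail)
grouped_hits = re.findall(grouped, tail)

print(ungrouped_hits)  # three separate one-item matches
print(grouped_hits)    # a single match covering all three items
```

The ungrouped pattern matches each ", 'item'" separately, which is why the original combined pattern stopped after the second item.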

I only added the needed parentheses to the regexp strings and made no other fixes. Note that this does not match the closing parenthesis of the input string; append '\\s*\\)' to the pattern if you want that.
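
If a true/false answer for the entire line is enough, one possible sketch is to add the closing parenthesis and use `re.fullmatch`. The names `RULE` and `is_valid` are hypothetical, and using `*` for the repeated part (instead of two alternatives) is my simplification, not part of the code above.

```python
import re

ITEM = r"'\w+'"  # one quoted item, e.g. 'connect'

# Assumed rule: <name> is|not in ('item', 'item', ...)
RULE = re.compile(
    r"\w+\s+(?:is|not)\sin\s+\("      # e.g. "orderType not in ("
    + ITEM                            # first quoted item
    + r"(?:\s?,\s?" + ITEM + r")*"    # zero or more ", 'item'" repetitions
    + r"\s*\)"                        # closing parenthesis
)

def is_valid(txt):
    """Return True only if the entire string matches the rule."""
    return RULE.fullmatch(txt) is not None

samples = [
    "orderType not in ('connect', 'Modify', 'random', 'more')",
    "orderType is in ('connect')",
    "orderType not in ('connect', 'Modify')",
    "orderType not in ('connect', Modify, 'ran=dom', 'more')",
]
for s in samples:
    print(is_valid(s), s)
```

Because `fullmatch` requires the pattern to consume the whole string, the last sample is rejected rather than partially matched up to `'connect'`.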

Try it:

import re

texts=[ "orderType not in ('connect', 'Modify', 'random', 'more')",
        "orderType is in ('connect')",
        "orderType not in ('connect', 'Modify')"
        ]

reg=re.compile( r"\s*orderType\s+(?:is|(not))\s+in\s+\(\s*'connect'\s*(?(1),\s*'Modify'\s*\)|\))" )
for txt in texts:
    m=reg.fullmatch(txt)
    print("matched -->" if m else "not matched -->",txt)

"""
Descriptions:
    (is|(not))      The external parentheses to encapsulate the '|' pattern (or);
    (?:is|(not))    ?: The encapsulated expression matching as a group, is not interesting to us as a whole.
    (not)           If matched, it will be the 1st group.
    \(              Escape sequence, it matches the '(' .
    (?(1),\s*'Modify'\s*\)|\)   Yes-no pattern (?(group_number)expr1|expr2)
"""
