
RegEx match (named) groups that show up in random order (Python re)

I am trying to match the named groups in the regex below (preArgs, apm1Args, midArgs, apm2Args, postArgs), which can appear in random order.
The regex matches Test String 1 but not Test String 2, both shown below.

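I need the following requirements met:

1. Each group may occur one or more times (because of leftover junk), or may be absent entirely.

2. Each of apm1Args and apm2Args always comes with one or more -D switches, in addition to its single javaagent jar.

I tried alternation (|) and positive lookahead (?=), but with no luck, and I got lost in the maze.

RegEx (also posted on regex101.com):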

^(?P<preArgs>.*)(?P<apm1Args>-javaagent:.+\/agent1\.jar\s+(?:-Dvendor1\.agent1\.\S+\s*)*)(?P<midArgs>.*)(?P<apm2Args>-javaagent:.+\/agent2\.jar\s+(?:-Dvendor2\.agent2\.\S+\s*)*)(?P<postArgs>.*)$
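The pattern above hard-codes the order apm1Args, then apm2Args, which would explain why only one of the two test strings matches. A minimal check, using shortened and hypothetical variants of the test strings (the names ORDERED, AGENT1, AGENT2, agent1_first and agent2_first are mine, not from the question):

```python
import re

# The pattern from the question, unchanged, split over lines for readability.
ORDERED = re.compile(
    r'^(?P<preArgs>.*)'
    r'(?P<apm1Args>-javaagent:.+\/agent1\.jar\s+(?:-Dvendor1\.agent1\.\S+\s*)*)'
    r'(?P<midArgs>.*)'
    r'(?P<apm2Args>-javaagent:.+\/agent2\.jar\s+(?:-Dvendor2\.agent2\.\S+\s*)*)'
    r'(?P<postArgs>.*)$'
)

# Shortened stand-ins for the two agent blocks (assumption: one -D switch each).
AGENT1 = '-javaagent:/path1/to/agent1.jar -Dvendor1.agent1.nodeName=myNode1'
AGENT2 = '-javaagent:/path2/to/vendor2/agent2.jar -Dvendor2.agent2.customValue1=myValue2'

agent1_first = '-Xdebug ' + AGENT1 + ' -Xgcpolicy:gencon ' + AGENT2
agent2_first = '-Xdebug ' + AGENT2 + ' ' + AGENT1 + ' -Xgcpolicy:gencon'

print(ORDERED.search(agent1_first) is not None)  # True: agent1 precedes agent2
print(ORDERED.search(agent2_first) is not None)  # False: no agent2 block after agent1
```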

Test String 1

-Xdebug -Xnoagent -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=7777 -javaagent:/path1/to/agent1.jar -Dvendor1.agent1.applicationName=app123 -Dvendor1.agent1.tierName=myTier1 -Dvendor1.agent1.nodeName=myNode1 -Dvendor1.agent1.uniqueHostId=myHost1 -Xgcpolicy:gencon -javaagent:/path2/to/vendor2/agent2.jar -Dvendor2.agent2.agentProfile=/path2/to/profiles/agent2.profile -Dvendor2.agent2.customValue1=myValue2

Test String 2 (the same RegEx is posted with this string under a separate regex101.com link)

-Xdebug -Xnoagent -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=7777 -javaagent:/path2/to/vendor2/agent2.jar -Dvendor2.agent2.agentProfile=/path2/to/profiles/agent2.profile -Dvendor2.agent2.customValue1=myValue2 -javaagent:/path1/to/agent1.jar -Dvendor1.agent1.applicationName=app123 -Dvendor1.agent1.tierName=myTier1 -Dvendor1.agent1.nodeName=myNode1 -Dvendor1.agent1.uniqueHostId=myHost1 -Xgcpolicy:gencon
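One order-independent option is to stop describing the whole line with a single pattern and instead scan for agent blocks with re.finditer, using an alternation of named groups and m.lastgroup to tell which kind was found. This is only a sketch under my own assumptions: AGENT_BLOCK and split_args are hypothetical names, the sample is a shortened variant of Test String 2, and the -D switches are matched with `*` so that a bare jar is tolerated as well.

```python
import re

# Either kind of agent block, in whatever order they occur.
AGENT_BLOCK = re.compile(
    r'(?P<apm1Args>-javaagent:\S*/agent1\.jar(?:\s+-Dvendor1\.agent1\.\S+)*)'
    r'|(?P<apm2Args>-javaagent:\S*/agent2\.jar(?:\s+-Dvendor2\.agent2\.\S+)*)'
)

def split_args(jvm_args):
    """Return ([(group_name, block_text), ...], remaining_args)."""
    # lastgroup names the alternative that matched, since the inner groups
    # are non-capturing.
    blocks = [(m.lastgroup, m.group()) for m in AGENT_BLOCK.finditer(jvm_args)]
    # Everything that is not an agent block, with whitespace normalised.
    rest = ' '.join(AGENT_BLOCK.sub(' ', jvm_args).split())
    return blocks, rest

sample = ('-Xdebug -javaagent:/path2/to/vendor2/agent2.jar '
          '-Dvendor2.agent2.customValue1=myValue2 '
          '-javaagent:/path1/to/agent1.jar -Dvendor1.agent1.nodeName=myNode1 '
          '-Xgcpolicy:gencon')

blocks, rest = split_args(sample)
for name, text in blocks:
    print(name, '->', text)
print('rest ->', rest)
```

Because each block is found independently, repeated blocks and missing blocks need no special handling: the list simply has more or fewer entries.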

Update:

I ended up using a loop in Python to strip the apmArgs groups, which may appear in any order or not at all. My snippet is below (it can also be tested on repl.it):

import re

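# One pattern per agent type; each splits the string around a single agent block.
# Raw strings keep the regex escapes (\s, \S, \/, \.) from being read as string escapes.
regExArr = [
    r'(?P<preArgs>.*)(?P<apmArgs>-javaagent:\s*\/\S+agent1\.jar\s+(?:-Dvendor1\.agent1\.\S+\s*)*)(?P<postArgs>.*)',
    r'(?P<preArgs>.*)(?P<apmArgs>-javaagent:\s*\/\S+agent2\.jar\s+(?:-Dvendor2\.agent2\.\S+\s*)*)(?P<postArgs>.*)',
]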

testStrList=[
  '-javaagent:/path1/to/agent1.jar -Dvendor1.agent1.applicationName=app123 -Dvendor1.agent1.tierName=myTier1 -Dvendor1.agent1.nodeName=myNode1 -Dvendor1.agent1.uniqueHostId=myHost1 -javaagent:/path1/to/agent1.jar -Dvendor1.agent1.applicationName=app123 -Dvendor1.agent1.tierName=myTier1 -Dvendor1.agent1.nodeName=myNode1 -Dvendor1.agent1.uniqueHostId=myHost1 -Xgcpolicy:gencon -javaagent:/path2/to/vendor2/agent2.jar -Dvendor2.agent2.agentProfile=/path2/to/profiles/agent2.profile -Dvendor2.agent2.customValue1=myValue2'
,'-Xdebug -Xnoagent -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=7777 -javaagent:/path1/to/agent1.jar -Dvendor1.agent1.applicationName=app123 -Dvendor1.agent1.tierName=myTier1 -Dvendor1.agent1.nodeName=myNode1 -Dvendor1.agent1.uniqueHostId=myHost1'
,'-Xdebug -Xnoagent -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=7777 -javaagent:/path2/to/vendor2/agent2.jar -Dvendor2.agent2.agentProfile=/metlife/runtime/installed/apm/profiles/csa.profile -Dvendor2.agent2.customValue1=myValue2 -javaagent:/path1/to/agent1.jar -Dvendor1.agent1.applicationName=app123 -Dvendor1.agent1.tierName=myTier1 -Dvendor1.agent1.nodeName= -Dvendor1.agent1.uniqueHostId=myHost1 -Xgcpolicy:gencon'
,'-Xdebug -Xnoagent -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=7777 -Xgcpolicy:gencon'
]

newApmArgs='-javaagent:/path3/to/agent3.jar -Dvendor3.agent3.applicationName=app1234 -Dvendor3.agent3.tierName=myTier13 -Dvendor3.agent3.nodeName=myNode13 -Dvendor3.agent3.uniqueHostId=myHost13'

for i, testStr in enumerate(testStrList):
    for regEx in regExArr:
        matchedArgs = re.search(regEx, testStr)
        # Keep stripping until no block of this agent type is left.
        while matchedArgs:
            print("matchedArgs found count:", len(matchedArgs.groups()))
            print("matchedArgs found:")
            print(matchedArgs.groups())
            # ignore any <apmArgs> group and concatenate the other groups
            testStr = (matchedArgs.group('preArgs').strip() + ' '
                       + matchedArgs.group('postArgs').strip()).strip()
            # check further for leftover <apmArgs> and repeat the clean-up
            matchedArgs = re.search(regEx, testStr)

    # Append the replacement agent arguments to the cleaned string.
    testStrList[i] = testStr + ' ' + newApmArgs

print("cleaned up list testStrList that had Random groups of APM Args Text (now appended with 3rd type APM Args) is:")
print(testStrList)

Output:

matchedArgs found count: 3
matchedArgs found:
('-javaagent:/path1/to/agent1.jar -Dvendor1.agent1.applicationName=app123 -Dvendor1.agent1.tierName=myTier1 -Dvendor1.agent1.nodeName=myNode1 -Dvendor1.agent1.uniqueHostId=myHost1 ', '-javaagent:/path1/to/agent1.jar -Dvendor1.agent1.applicationName=app123 -Dvendor1.agent1.tierName=myTier1 -Dvendor1.agent1.nodeName=myNode1 -Dvendor1.agent1.uniqueHostId=myHost1 ', '-Xgcpolicy:gencon -javaagent:/path2/to/vendor2/agent2.jar -Dvendor2.agent2.agentProfile=/path2/to/profiles/agent2.profile -Dvendor2.agent2.customValue1=myValue2')
matchedArgs found count: 3
matchedArgs found:
('', '-javaagent:/path1/to/agent1.jar -Dvendor1.agent1.applicationName=app123 -Dvendor1.agent1.tierName=myTier1 -Dvendor1.agent1.nodeName=myNode1 -Dvendor1.agent1.uniqueHostId=myHost1 ', '-Xgcpolicy:gencon -javaagent:/path2/to/vendor2/agent2.jar -Dvendor2.agent2.agentProfile=/path2/to/profiles/agent2.profile -Dvendor2.agent2.customValue1=myValue2')
matchedArgs found count: 3
matchedArgs found:
('-Xgcpolicy:gencon ', '-javaagent:/path2/to/vendor2/agent2.jar -Dvendor2.agent2.agentProfile=/path2/to/profiles/agent2.profile -Dvendor2.agent2.customValue1=myValue2', '')
matchedArgs found count: 3
matchedArgs found:
('-Xdebug -Xnoagent -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=7777 ', '-javaagent:/path1/to/agent1.jar -Dvendor1.agent1.applicationName=app123 -Dvendor1.agent1.tierName=myTier1 -Dvendor1.agent1.nodeName=myNode1 -Dvendor1.agent1.uniqueHostId=myHost1', '')
matchedArgs found count: 3
matchedArgs found:
('-Xdebug -Xnoagent -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=7777 -javaagent:/path2/to/vendor2/agent2.jar -Dvendor2.agent2.agentProfile=/metlife/runtime/installed/apm/profiles/csa.profile -Dvendor2.agent2.customValue1=myValue2 ', '-javaagent:/path1/to/agent1.jar -Dvendor1.agent1.applicationName=app123 -Dvendor1.agent1.tierName=myTier1 -Dvendor1.agent1.nodeName= -Dvendor1.agent1.uniqueHostId=myHost1 ', '-Xgcpolicy:gencon')
matchedArgs found count: 3
matchedArgs found:
('-Xdebug -Xnoagent -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=7777 ', '-javaagent:/path2/to/vendor2/agent2.jar -Dvendor2.agent2.agentProfile=/metlife/runtime/installed/apm/profiles/csa.profile -Dvendor2.agent2.customValue1=myValue2 ', '-Xgcpolicy:gencon')
cleaned up list testStrList that had Random groups of APM Args Text (now appended with 3rd type APM Args) is:
['-Xgcpolicy:gencon -javaagent:/path3/to/agent3.jar -Dvendor3.agent3.applicationName=app1234 -Dvendor3.agent3.tierName=myTier13 -Dvendor3.agent3.nodeName=myNode13 -Dvendor3.agent3.uniqueHostId=myHost13', '-Xdebug -Xnoagent -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=7777 -javaagent:/path3/to/agent3.jar -Dvendor3.agent3.applicationName=app1234 -Dvendor3.agent3.tierName=myTier13 -Dvendor3.agent3.nodeName=myNode13 -Dvendor3.agent3.uniqueHostId=myHost13', '-Xdebug -Xnoagent -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=7777 -Xgcpolicy:gencon -javaagent:/path3/to/agent3.jar -Dvendor3.agent3.applicationName=app1234 -Dvendor3.agent3.tierName=myTier13 -Dvendor3.agent3.nodeName=myNode13 -Dvendor3.agent3.uniqueHostId=myHost13', '-Xdebug -Xnoagent -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=7777 -Xgcpolicy:gencon -javaagent:/path3/to/agent3.jar -Dvendor3.agent3.applicationName=app1234 -Dvendor3.agent3.tierName=myTier13 -Dvendor3.agent3.nodeName=myNode13 -Dvendor3.agent3.uniqueHostId=myHost13']
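The nested loop above could probably be collapsed into a single re.sub call, since sub already removes every non-overlapping match regardless of order. The sketch below is an assumption of mine rather than part of the original update: APM_BLOCK and replace_apm_args are hypothetical names, and the backreference (?P=n) is used so that one pattern covers both agent1 and agent2 while still requiring the -D switches to belong to the same vendor number as the jar.

```python
import re

# One pattern for both agent types: n captures 1 or 2 from the jar name,
# and the -D switches must carry the same number.
APM_BLOCK = re.compile(
    r'-javaagent:\s*/\S+agent(?P<n>[12])\.jar\s+'
    r'(?:-Dvendor(?P=n)\.agent(?P=n)\.\S+\s*)*'
)

def replace_apm_args(jvm_args, new_apm_args):
    """Drop every agent1/agent2 block and append new_apm_args."""
    # Replace each block with a space, then normalise the whitespace.
    kept = ' '.join(APM_BLOCK.sub(' ', jvm_args).split())
    return (kept + ' ' + new_apm_args).strip()

# Shortened, hypothetical inputs modelled on the test strings above.
old_args = ('-Xdebug -javaagent:/path2/to/vendor2/agent2.jar '
            '-Dvendor2.agent2.customValue1=myValue2 '
            '-javaagent:/path1/to/agent1.jar -Dvendor1.agent1.nodeName= '
            '-Xgcpolicy:gencon')
new_args = '-javaagent:/path3/to/agent3.jar -Dvendor3.agent3.nodeName=myNode13'

print(replace_apm_args(old_args, new_args))
```

Like the patterns in regExArr, this requires whitespace after the jar name, so a bare -javaagent switch at the very end of the line is left untouched.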

You may find that a pyparsing approach gets you there faster than wrangling regular expressions. Here is a parser that handles both of your test strings:

import pyparsing as pp

# just some punctuation
COLON,EQ = map(pp.Suppress, ':=')

# expressions for key=value,... switches
subkey = pp.Word(pp.alphas)
subvalue = pp.pyparsing_common.integer | pp.Word(pp.printables, excludeChars=',')
key_value_list = pp.Dict(pp.delimitedList(pp.Group(subkey + EQ + subvalue)))

# parse switches
switch_key = pp.Word('-', pp.alphas).setParseAction(lambda t: t[0][1:].lower())
switch_value = key_value_list | subvalue
switch = switch_key + pp.Optional(COLON + switch_value)

# -D definitions
java_path_name = pp.delimitedList(pp.pyparsing_common.identifier, delim='.', combine=True)
defn = (pp.Suppress("-D") +  java_path_name.leaveWhitespace()
        + EQ.leaveWhitespace() 
        + pp.Optional(subvalue().leaveWhitespace()))

# define parser for the entire line - use Dict class to define dynamic key-value structures instead of just 2-tuples
parser = pp.Dict(pp.OneOrMore(pp.Group(defn | switch)))

tests = """\
-Xdebug -Xnoagent -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=7777 -javaagent:/path1/to/agent1.jar -Dvendor1.agent1.applicationName=app123 -Dvendor1.agent1.tierName=myTier1 -Dvendor1.agent1.nodeName= -Dvendor1.agent1.uniqueHostId=myHost1 -Xgcpolicy:gencon -javaagent:/path2/to/vendor2/agent2.jar -Dvendor2.agent2.agentProfile=/metlife/runtime/installed/apm/profiles/csa.profile -Dvendor2.agent2.customValue1=myValue2
-Xdebug -Xnoagent -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=7777 -javaagent:/path2/to/vendor2/agent2.jar -Dvendor2.agent2.agentProfile=/metlife/runtime/installed/apm/profiles/csa.profile -Dvendor2.agent2.customValue1=myValue2 -javaagent:/path1/to/agent1.jar -Dvendor1.agent1.applicationName=app123 -Dvendor1.agent1.tierName=myTier1 -Dvendor1.agent1.nodeName= -Dvendor1.agent1.uniqueHostId=myHost1 -Xgcpolicy:gencon
"""
parser.runTests(tests)

Prints:

-Xdebug -Xnoagent -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=7777 -javaagent:/path1/to/agent1.jar -Dvendor1.agent1.applicationName=app123 -Dvendor1.agent1.tierName=myTier1 -Dvendor1.agent1.nodeName= -Dvendor1.agent1.uniqueHostId=myHost1 -Xgcpolicy:gencon -javaagent:/path2/to/vendor2/agent2.jar -Dvendor2.agent2.agentProfile=/metlife/runtime/installed/apm/profiles/csa.profile -Dvendor2.agent2.customValue1=myValue2
[['xdebug'], ['xnoagent'], ['xrunjdwp', ['transport', 'dt_socket'], ['server', 'y'], ['suspend', 'y'], ['address', 7777]], ['javaagent', '/path1/to/agent1.jar'], ['vendor1.agent1.applicationName', 'app123'], ['vendor1.agent1.tierName', 'myTier1'], ['vendor1.agent1.nodeName'], ['vendor1.agent1.uniqueHostId', 'myHost1'], ['xgcpolicy', 'gencon'], ['javaagent', '/path2/to/vendor2/agent2.jar'], ['vendor2.agent2.agentProfile', '/metlife/runtime/installed/apm/profiles/csa.profile'], ['vendor2.agent2.customValue1', 'myValue2']]
- javaagent: '/path2/to/vendor2/agent2.jar'
- vendor1.agent1.applicationName: 'app123'
- vendor1.agent1.nodeName: ''
- vendor1.agent1.tierName: 'myTier1'
- vendor1.agent1.uniqueHostId: 'myHost1'
- vendor2.agent2.agentProfile: '/metlife/runtime/installed/apm/profiles/csa.profile'
- vendor2.agent2.customValue1: 'myValue2'
- xdebug: ''
- xgcpolicy: 'gencon'
- xnoagent: ''
- xrunjdwp: [['transport', 'dt_socket'], ['server', 'y'], ['suspend', 'y'], ['address', 7777]]
  - address: 7777
  - server: 'y'
  - suspend: 'y'
  - transport: 'dt_socket'


-Xdebug -Xnoagent -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=7777 -javaagent:/path2/to/vendor2/agent2.jar -Dvendor2.agent2.agentProfile=/metlife/runtime/installed/apm/profiles/csa.profile -Dvendor2.agent2.customValue1=myValue2 -javaagent:/path1/to/agent1.jar -Dvendor1.agent1.applicationName=app123 -Dvendor1.agent1.tierName=myTier1 -Dvendor1.agent1.nodeName= -Dvendor1.agent1.uniqueHostId=myHost1 -Xgcpolicy:gencon
[['xdebug'], ['xnoagent'], ['xrunjdwp', ['transport', 'dt_socket'], ['server', 'y'], ['suspend', 'y'], ['address', 7777]], ['javaagent', '/path2/to/vendor2/agent2.jar'], ['vendor2.agent2.agentProfile', '/metlife/runtime/installed/apm/profiles/csa.profile'], ['vendor2.agent2.customValue1', 'myValue2'], ['javaagent', '/path1/to/agent1.jar'], ['vendor1.agent1.applicationName', 'app123'], ['vendor1.agent1.tierName', 'myTier1'], ['vendor1.agent1.nodeName'], ['vendor1.agent1.uniqueHostId', 'myHost1'], ['xgcpolicy', 'gencon']]
- javaagent: '/path1/to/agent1.jar'
- vendor1.agent1.applicationName: 'app123'
- vendor1.agent1.nodeName: ''
- vendor1.agent1.tierName: 'myTier1'
- vendor1.agent1.uniqueHostId: 'myHost1'
- vendor2.agent2.agentProfile: '/metlife/runtime/installed/apm/profiles/csa.profile'
- vendor2.agent2.customValue1: 'myValue2'
- xdebug: ''
- xgcpolicy: 'gencon'
- xnoagent: ''
- xrunjdwp: [['transport', 'dt_socket'], ['server', 'y'], ['suspend', 'y'], ['address', 7777]]
  - address: 7777
  - server: 'y'
  - suspend: 'y'
  - transport: 'dt_socket'

Here is some sample code for accessing the parsed fields:

t0 = tests.splitlines()[0]
result = parser.parseString(t0)
print(result.xrunjdwp.address)
print(result['vendor1.agent1.applicationName'])

Prints:

7777
app123
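One thing to watch in the dumps above: both -javaagent switches appear in the token list, but the named lookup shows only one javaagent entry per test string (the one that comes last). If each agent needs to stay paired with its own -D definitions, a plain-Python grouping pass is one way to do it. This is a standard-library sketch, not pyparsing; group_by_agent is a hypothetical helper, and it assumes that any -D switch directly following a -javaagent switch belongs to that agent.

```python
def group_by_agent(jvm_args):
    """Return (agents, others): agents pair each jar with its -D definitions."""
    agents, others = [], []
    current = None  # the agent whose -D switches are being collected
    for token in jvm_args.split():
        if token.startswith('-javaagent:'):
            current = {'jar': token[len('-javaagent:'):], 'defs': {}}
            agents.append(current)
        elif token.startswith('-D') and current is not None:
            key, _, value = token[2:].partition('=')
            current['defs'][key] = value
        else:
            # Any other switch ends the current agent block.
            current = None
            others.append(token)
    return agents, others

# Shortened, hypothetical variant of the second test string.
line = ('-Xdebug -javaagent:/path2/to/vendor2/agent2.jar '
        '-Dvendor2.agent2.customValue1=myValue2 '
        '-javaagent:/path1/to/agent1.jar -Dvendor1.agent1.nodeName= '
        '-Xgcpolicy:gencon')

agents, others = group_by_agent(line)
for agent in agents:
    print(agent['jar'], agent['defs'])
print(others)
```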

