
Python: Match string with multiple patterns

I have the string below in a variable:

'''
                                 Messages  Retrans   Timeout   Unexpected-Msg
           INVITE ---------->         30        0         0
              100 <----------         30        0         0         0
              180 <----------         12        0         0         18
              200 <----------         12        0         0         0

              ACK ---------->         12        0
             INFO ---------->         12        0         0
              200 <----------         12        0         0         0
       Pause [         10.0s]         12                            0
              BYE ---------->         12        0
              200 <----------         12        0         0         0

'''

How can I get the regex match output below with a single pattern, or with as few patterns as possible?

[('INVITE', '---------->', '30', '0', '0'), ('100', '<---------- ', '30', '0', '0', '0'), ('180', '<---------- ', '12', '0', '0', '18'), ('200', '<---------- ', '12', '0', '0', '0'), ('ACK', '---------->', '12', '0'), ('INFO', '---------->', '12', '0', '0'), ('200', '<---------- ', '12', '0', '0', '0'), ('BYE', '---------->', '12', '0'), ('200', '<---------- ', '12', '0', '0', '0')].

I used the script below to get that output.

+++++++++++++++++++++++++++++++++++++

import re

aa = '''
xmlSFT_Client_mscmlivr MSCML_FULL_AUDIO_Script_CA_disabled.
Warning: open file limit > FD_SETSIZE; limiting max. # of open files to FD_SETSIZE = 1024
Resolving remote host '10.214.13.168'... Done.
------------------------------ Scenario Screen -------- [1-9]: Change Screen --
  Call-rate(length)   Port   Total-time  Total-calls  Remote-host
  10.0(0 ms)/1.000s   4063      11.24 s           30  10.214.13.168:5060(UDP)

  Call limit reached (-m 30), 0.000 s period  0 ms scheduler resolution
  0 calls (limit 300)                    Peak was 13 calls, after 1 s
  0 Running, 31 Paused, 0 Woken up
  0 dead call msg (discarded)            0 out-of-call msg (discarded)
  2 open sockets

                                 Messages  Retrans   Timeout   Unexpected-Msg
           INVITE ---------->         30        0         0
              100 <----------         30        0         0         0
              180 <----------         12        0         0         18
              200 <----------         12        0         0         0

              ACK ---------->         12        0
             INFO ---------->         12        0         0
              200 <----------         12        0         0         0
       Pause [         10.0s]         12                            0
              BYE ---------->         12        0
              200 <----------         12        0         0         0

'''

# The same three patterns as before, ordered from most to fewest columns.
patterns = [
    r'^\s+(\w+)\s?(.-+.)\s+(\w+)\s+(\w+)\s+(\w+)\s+(\w+)',
    r'^\s+(\w+)\s?(.-+.)\s+(\w+)\s+(\w+)\s+(\w+)',
    r'^\s+(\w+)\s?(.-+.)\s+(\w+)\s+(\w+)',
]

a = []

for line in aa.split('\n'):
    # Try the longest pattern first so a row with more columns is not cut short.
    for p in patterns:
        found = re.findall(p, line)
        if found:
            a.append(found[0])
            break

print(a)

++++++++++++++++++++++++++++++++++++++

First of all, your question is not well formatted. I tried to format it for you, but it needs more text explaining what you want.

As I understand it, you have a string a:

a = '''
  2 open sockets

                                 Messages  Retrans   Timeout   Unexpected-Msg
           INVITE ---------->         30        0         0
              100 <----------         30        0         0         0
              180 <----------         12        0         0         18
              200 <----------         12        0         0         0

              ACK ---------->         12        0
             INFO ---------->         12        0         0
              200 <----------         12        0         0         0
       Pause [         10.0s]         12                            0
              BYE ---------->         12        0
              200 <----------         12        0         0         0

'''

and you want one result per row, like this: ('INVITE', '---------->', '30', '0', '0'),

I used the following regex to achieve that:

import re

a = '''
  2 open sockets

                                 Messages  Retrans   Timeout   Unexpected-Msg
           INVITE ---------->         30        0         0
              100 <----------         30        0         0         0
              180 <----------         12        0         0         18
              200 <----------         12        0         0         0

              ACK ---------->         12        0
             INFO ---------->         12        0         0
              200 <----------         12        0         0         0
       Pause [         10.0s]         12                            0
              BYE ---------->         12        0
              200 <----------         12        0         0         0

'''
pattern = re.compile(r'(?P<type>\d+|\w+)\s*(?P<dir>\<?\-+\>?)\s+(?P<messages>\d*)[\n\r\s]*(?P<retrans>\d*)[\n\r\s]*(?P<timeout>\d*)[\n\r\s]*(?P<unexpected_msg>\d*)\n',flags=re.MULTILINE)
result = pattern.finditer(a)
result_list = [m.groupdict() for m in result]

Output of result_list:

for i in result_list:
    print(i)

{'type': 'INVITE', 'dir': '---------->', 'messages': '30', 'retrans': '0', 'timeout': '0', 'unexpected_msg': ''}
{'type': '100', 'dir': '<----------', 'messages': '30', 'retrans': '0', 'timeout': '0', 'unexpected_msg': '0'}
{'type': '180', 'dir': '<----------', 'messages': '12', 'retrans': '0', 'timeout': '0', 'unexpected_msg': '18'}
{'type': '200', 'dir': '<----------', 'messages': '12', 'retrans': '0', 'timeout': '0', 'unexpected_msg': '0'}
{'type': 'ACK', 'dir': '---------->', 'messages': '12', 'retrans': '0', 'timeout': '', 'unexpected_msg': ''}
{'type': 'INFO', 'dir': '---------->', 'messages': '12', 'retrans': '0', 'timeout': '0', 'unexpected_msg': ''}
{'type': '200', 'dir': '<----------', 'messages': '12', 'retrans': '0', 'timeout': '0', 'unexpected_msg': '0'}
{'type': 'BYE', 'dir': '---------->', 'messages': '12', 'retrans': '0', 'timeout': '', 'unexpected_msg': ''}
{'type': '200', 'dir': '<----------', 'messages': '12', 'retrans': '0', 'timeout': '0', 'unexpected_msg': '0'}

You can then iterate over result_list.
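If the tuple form from the question is still wanted, a single pattern with optional groups can produce it directly. The following is a minimal sketch rather than the exact pattern used above: ROW, rows_as_tuples, and the shortened sample string are names and data assumed here for illustration. It matches horizontal whitespace only, so a match cannot run on to the next line.

```python
import re

# Shortened sample of the table from the question (an assumption for brevity).
sample = (
    "     INVITE ---------->         30        0         0\n"
    "        100 <----------         30        0         0         0\n"
    "        ACK ---------->         12        0\n"
    " Pause [         10.0s]         12                            0\n"
)

# One pattern: the Messages column is required, the other three are optional.
ROW = re.compile(
    r'^[ \t]*(?P<type>\w+)[ \t]*(?P<dir><?-+>?)'
    r'[ \t]+(?P<messages>\d+)'
    r'(?:[ \t]+(?P<retrans>\d+))?'
    r'(?:[ \t]+(?P<timeout>\d+))?'
    r'(?:[ \t]+(?P<unexpected_msg>\d+))?[ \t]*$',
    flags=re.MULTILINE,
)


def rows_as_tuples(text):
    """Return one tuple per arrow row, leaving out columns that are absent."""
    return [
        tuple(v for v in m.groups() if v is not None)
        for m in ROW.finditer(text)
    ]


print(rows_as_tuples(sample))
```

The Pause row is skipped because it has no arrow. Unmatched optional groups come back as None, so m.groupdict() on the same pattern still gives the dictionary form when that is more useful.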

I used a list of dicts instead of the list of tuples you mentioned in the question, because I think you may want to know which values are missing. A tuple like ('ACK', '---------->', '12', '0') does not tell you which two values are missing. For example, the Pause line has only a Messages value and an Unexpected-Msg value: Pause [ 10.0s] 12 0
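To show which column a value really belongs to, one option is to use the position of each column title in the header line instead of the order of the numbers. This is only a sketch of a different technique (position-based parsing, not the regex above), and it assumes that each number is printed somewhere under its column title, as it appears to be in the question's table. The names header, ack_row, pause_row, and values_by_column are hypothetical, and the rows are rebuilt with rjust to mimic the table's spacing.

```python
import re

# Header and two rows rebuilt to mimic the table in the question; the exact
# widths are an assumption, since they depend on the tool's real output.
header = " " * 33 + "Messages  Retrans   Timeout   Unexpected-Msg"
ack_row = "ACK ---------->".rjust(29) + "12".rjust(11) + "0".rjust(9)
pause_row = "Pause [         10.0s]".rjust(29) + "12".rjust(11) + "0".rjust(29)

# Character span (start, end) of every column title in the header line.
columns = [(m.group(), m.start(), m.end()) for m in re.finditer(r'\S+', header)]


def values_by_column(row):
    """Map each column title to the number printed under it, or None."""
    found = {name: None for name, _, _ in columns}
    # Only whole whitespace-delimited numbers count, so "10.0s]" is ignored.
    for m in re.finditer(r'(?<!\S)\d+(?!\S)', row):
        for name, start, end in columns:
            # A number belongs to the title whose span it overlaps.
            if m.start() < end and m.end() > start:
                found[name] = m.group()
    return found


print(values_by_column(ack_row))
print(values_by_column(pause_row))
```

With this layout the Pause row reports values for Messages and Unexpected-Msg only, which an order-based match could not distinguish from Messages and Retrans.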

I hope this is what you are looking for. Please add more description to the question and format your code properly.
