Capture empty columns in multiline string via regex

Can someone help me write a regex for the following? I need a regex that captures the value of every column below, including the empty ones, which should come back as ''.

I am new to regex. I tried many things to the best of my knowledge but couldn't figure it out.

string = '''
Local Address   Port  Port Type  Probes     Drops      CtrlProbes Discard Protocol SendersCnt 
1.1.1.1         777   Permanent  9854579    0          9854677                                
2.2.2.2         15000 Dynamic    6569029    1          656905     ON      IPSLA    2          
3.3.3.3         15000 Dynamic    6569003    0          656903     OFF     IPSLA    2          
4.4.4.4         15000 Dynamic    6569029    0          656904     ON      IPSLA    2          
5.5.5.5         15000 Dynamic    1259435    0          125945     ON      IPSLA    2           
'''

I want to achieve this output:

{
    'local_addr': '1.1.1.1',
    'port': '777',
    'port_type': 'Permanent',
    'probes': '9854579',
    'drops': '0',
    'ctrl_probes': '9854677',
    'discard': '',
    'protocol': '',
    'sender_cnt': ''
}
{
    'local_addr': '2.2.2.2',
    'port': '15000',
    'port_type': 'Dynamic',
    'probes': '6569029',
    'drops': '1',
    'ctrl_probes': '656905',
    'discard': 'ON',
    'protocol': 'IPSLA',
    'sender_cnt': '2'
}
{
    'local_addr': '3.3.3.3',
    'port': '15000',
    'port_type': 'Dynamic',
    'probes': '6569003',
    'drops': '0',
    'ctrl_probes': '656903',
    'discard': 'OFF',
    'protocol': 'IPSLA',
    'sender_cnt': '2'
}
{
    'local_addr': '4.4.4.4',
    'port': '15000',
    'port_type': 'Dynamic',
    'probes': '6569029',
    'drops': '0',
    'ctrl_probes': '656904',
    'discard': 'ON',
    'protocol': 'IPSLA',
    'sender_cnt': '2'
}
{
    'local_addr': '5.5.5.5',
    'port': '15000',
    'port_type': 'Dynamic',
    'probes': '1259435',
    'drops': '0',
    'ctrl_probes': '125945',
    'discard': 'ON',
    'protocol': 'IPSLA',
    'sender_cnt': '2'
}

This is the best one I could come up with, but it is still not right: it runs over onto the next line and captures the next IP as another group.

my_regex = r'^(?P<local_addr>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s+(?P<port>\d+)\s+(?P<port_type>[A-Z]\w+)\s+(?P<probes>\d+)\s+(?P<drops>\d+)\s+(?P<ctrl_probes>\S*)\s*(?P<discard>\S+)\s*(?P<protocol>IPSLA|\s+)\s+(?P<sender_cnt>\d+)\s+'

I was using the re module in Python and doing the match with re.match.

Let me know if you need more information.

Thanks in advance.

You do not need regex. Do the following:

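import io
import pandas as pd

# Join the two-word headers so that splitting on runs of spaces keeps each as one column.
text = string.replace('Local Address', 'Local_Address').replace('Port Type', 'Port_Type')

# A regex separator (one or more spaces) requires the python engine.
(pd.read_table(io.StringIO(text), sep=' +', engine='python')
   .replace({None: ''})
   .to_dict('records'))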

Out[]: 
[{'Local_Address': '1.1.1.1',
  'Port': 777,
  'Port_Type': 'Permanent',
  'Probes': 9854579,
  'Drops': 0,
  'CtrlProbes': 9854677,
  'Discard': '',
  'Protocol': '',
  'SendersCnt': ''},
 {'Local_Address': '2.2.2.2',
  'Port': 15000,
  'Port_Type': 'Dynamic',
  'Probes': 6569029,
  'Drops': 1,
  'CtrlProbes': 656905,
  'Discard': 'ON',
  'Protocol': 'IPSLA',
  'SendersCnt': 2.0},
 {'Local_Address': '3.3.3.3',
  'Port': 15000,
  'Port_Type': 'Dynamic',
  'Probes': 6569003,
  'Drops': 0,
  'CtrlProbes': 656903,
  'Discard': 'OFF',
  'Protocol': 'IPSLA',
  'SendersCnt': 2.0},
 {'Local_Address': '4.4.4.4',
  'Port': 15000,
  'Port_Type': 'Dynamic',
  'Probes': 6569029,
  'Drops': 0,
  'CtrlProbes': 656904,
  'Discard': 'ON',
  'Protocol': 'IPSLA',
  'SendersCnt': 2.0}]
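
If you would still rather stay with the re module, here is a minimal sketch of a regex-only variant. The pattern in the question runs onto the next line because \s also matches newlines, and re.match only tries the very start of the string. The sketch uses [ \t] for the gaps between columns, makes the last three columns optional, and anchors each row with re.MULTILINE. The names ROW, text and rows are made up for illustration. It assumes that only the trailing Discard, Protocol and SendersCnt columns can be empty, and that Discard is always ON or OFF as in the sample. If your real data differs, the pattern would need adjusting.

```python
import re

# Sample copied from the question (header plus three of the rows).
text = """
Local Address   Port  Port Type  Probes     Drops      CtrlProbes Discard Protocol SendersCnt
1.1.1.1         777   Permanent  9854579    0          9854677
2.2.2.2         15000 Dynamic    6569029    1          656905     ON      IPSLA    2
3.3.3.3         15000 Dynamic    6569003    0          656903     OFF     IPSLA    2
"""

# [ \t] matches horizontal whitespace only, so a gap can never swallow a newline.
# The last three columns sit in optional groups (assumption: only they can be empty).
ROW = re.compile(
    r"""^(?P<local_addr>\d{1,3}(?:\.\d{1,3}){3})[ \t]+
        (?P<port>\d+)[ \t]+
        (?P<port_type>\w+)[ \t]+
        (?P<probes>\d+)[ \t]+
        (?P<drops>\d+)[ \t]+
        (?P<ctrl_probes>\d+)
        (?:[ \t]+(?P<discard>ON|OFF))?
        (?:[ \t]+(?P<protocol>[A-Za-z]\w*))?
        (?:[ \t]+(?P<sender_cnt>\d+))?
        [ \t]*$""",
    re.MULTILINE | re.VERBOSE,
)

# finditer walks every row; groupdict(default="") turns unmatched groups into ''.
rows = [m.groupdict(default="") for m in ROW.finditer(text)]

for row in rows:
    print(row)
```

The header line is skipped simply because it does not start with an IP address. Trailing spaces at the end of a row are absorbed by the final [ \t]*$.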
