繁体   English   中英

在Python中合并正则表达式

[英]Combine regular expression in Python

我是正则表达式的新手。

我正在尝试获取svstat命令中已启动或关闭的服务列表。

svstat的示例输出:

/etc/service/worker-test-1: up (pid 1234) 97381 seconds
/etc/service/worker-test-2: up (pid 4567) 92233 seconds
/etc/service/worker-test-3: up (pid 8910) 97381 seconds
/etc/service/worker-test-4: down 9 seconds, normally up
/etc/service/worker-test-5: down 9 seconds, normally up
/etc/service/worker-test-6: down 9 seconds, normally up

所以,目前我需要2个正则表达式来过滤UP或DOWN的服务

UP的regex-1示例:

/etc/service/(?P<service_name>.+):\s(?P<status>up|down)\s\(pid\s(?P<pid>\d+)\)\s(?P<seconds>\d+)

regex-1的输出:

Match 1
status -> up
service_name -> worker-test-1
pid -> 1234
seconds -> 97381

Match 2
status -> up
service_name -> worker-test-2
pid -> 4567
seconds -> 92233

Match 3
status -> up
service_name -> worker-test-3
pid -> 8910
seconds -> 97381

DOWN的样本regex-2

/etc/service/(?P<service_name>.+):\s(?P<status>up|down)\s(?P<seconds>\d+)

regex-2的输出

Match 1
status -> down
service_name -> worker-test-4
seconds -> 9

Match 2
status -> down
service_name -> worker-test-5
seconds -> 9

Match 3
status -> down
service_name -> worker-test-6
seconds -> 9

问题是,如何仅使用1个正则表达式同时获得UP和DOWN?

顺便说一下,我使用http://pythex.org/创建和测试了这些正则表达式。

您可以将pid包含在可选的非捕获组中:

/etc/service/(?P<service_name>.+):\s(?P<status>up|down)(?:\s\(pid\s(?P<pid>\d+)\))?\s(?P<seconds>\d+)

如果服务关闭,这将导致pidNone 请参阅Regex101演示。

如这里所承诺的,我的午休方案(不想谈论固定令牌拆分解析,但是在考虑仅OP知道的其余用例时可能会派上用场;-)

#! /usr/bin/env python
from __future__ import print_function

d = """
/etc/service/worker-test-1: up (pid 1234) 97381 seconds
/etc/service/worker-test-2: up (pid 4567) 92233 seconds
/etc/service/worker-test-3: up (pid 8910) 97381 seconds
/etc/service/worker-test-4: down 9 seconds, normally up
/etc/service/worker-test-5: down 9 seconds, normally up
/etc/service/worker-test-6: down 9 seconds, normally up
"""


def service_state_parser_gen(text_lines):
    """Parse the lines from service monitor by splitting
    on well known binary condition (either up or down)
    and parse the rest of the fields based on fixed
    position split on sanitized data (in the up case).
    yield tuple of key and dictionary as result or of
    None, None when neihter up nor down detected."""

    token_up = ': up '
    token_down = ': down '
    path_sep = '/'

    for line in d.split('\n'):
        if token_up in line:
            chunks = line.split(token_up)
            status = token_up.strip(': ')
            service = chunks[0].split(path_sep)[-1]
            _, pid, seconds, _ = chunks[1].replace(
                '(', '').replace(')', '').split()
            yield service, {'name': service,
                            'status': status,
                            'pid': int(pid),
                            'seconds': int(seconds)}
        elif token_down in line:
            chunks = line.split(token_down)
            status = token_down.strip(': ')
            service = chunks[0].split(path_sep)[-1]
            pid = None
            seconds, _, _, _ = chunks[1].split()
            yield service, {'name': service,
                            'status': status,
                            'pid': None,
                            'seconds': int(seconds)}
        else:
            yield None, None


def main():
    """Sample driver for parser generator function."""

    services = {}
    for key, status_map in service_state_parser_gen(d):
        if key is None:
            print("Non-Status line ignored.")
        else:
            services[key] = status_map

    print(services)

if __name__ == '__main__':
    main()

在运行时,它在给定的样本输入上产生结果:

Non-Status line ignored.
Non-Status line ignored.
{'worker-test-1': {'status': 'up', 'seconds': 97381, 'pid': 1234, 'name': 'worker-test-1'}, 'worker-test-3': {'status': 'up', 'seconds': 97381, 'pid': 8910, 'name': 'worker-test-3'}, 'worker-test-2': {'status': 'up', 'seconds': 92233, 'pid': 4567, 'name': 'worker-test-2'}, 'worker-test-5': {'status': 'down', 'seconds': 9, 'pid': None, 'name': 'worker-test-5'}, 'worker-test-4': {'status': 'down', 'seconds': 9, 'pid': None, 'name': 'worker-test-4'}, 'worker-test-6': {'status': 'down', 'seconds': 9, 'pid': None, 'name': 'worker-test-6'}}

因此,否则将在已命名的组中以匹配的方式存储已存储的信息(已经将类型转换为dict中匹配键下的值。如果服务关闭,则当然没有进程ID,因此pid被映射为None ,这使得编写代码变得容易以一种健壮的方式来解决它(如果将所有向下服务存储在一个隐式的单独结构中,则建议不要访问pid字段...

希望能帮助到你。 PS:是的,由于其包含的内容,对于展示函数的参数名称text_lines并不是最佳名称,但是您应该了解解析的想法。

我不知道您是否被迫使用正则表达式,但是如果您不必这样做,则可以执行以下操作:

if "down" in linetext:
    print( "is down" )
else:
    print( "is up" )

更容易阅读,也更快。

暂无
暂无

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM