Combine regular expressions in Python

I'm new to regular expressions.

I'm trying to get a list of the services that are up or down from the output of the svstat command.

Example output of svstat:

/etc/service/worker-test-1: up (pid 1234) 97381 seconds
/etc/service/worker-test-2: up (pid 4567) 92233 seconds
/etc/service/worker-test-3: up (pid 8910) 97381 seconds
/etc/service/worker-test-4: down 9 seconds, normally up
/etc/service/worker-test-5: down 9 seconds, normally up
/etc/service/worker-test-6: down 9 seconds, normally up

Currently I need two regexes: one to match services that are UP and one for services that are DOWN.

Sample regex-1 for UP:

/etc/service/(?P<service_name>.+):\s(?P<status>up|down)\s\(pid\s(?P<pid>\d+)\)\s(?P<seconds>\d+)

Output for regex-1:

Match 1
status -> up
service_name -> worker-test-1
pid -> 1234
seconds -> 97381

Match 2
status -> up
service_name -> worker-test-2
pid -> 4567
seconds -> 92233

Match 3
status -> up
service_name -> worker-test-3
pid -> 8910
seconds -> 97381

Sample regex-2 for DOWN:

/etc/service/(?P<service_name>.+):\s(?P<status>up|down)\s(?P<seconds>\d+)

Output for regex-2:

Match 1
status -> down
service_name -> worker-test-4
seconds -> 9

Match 2
status -> down
service_name -> worker-test-5
seconds -> 9

Match 3
status -> down
service_name -> worker-test-6
seconds -> 9

The question is: how can I use a single regex to match both UP and DOWN?

By the way, I'm using http://pythex.org/ to create and test these regexes.

You could enclose the pid part in an optional non-capturing group:

/etc/service/(?P<service_name>.+):\s(?P<status>up|down)(?:\s\(pid\s(?P<pid>\d+)\))?\s(?P<seconds>\d+)

This results in pid being None when the service is down. See the Regex101 demo.
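As a minimal sketch of how the combined pattern might be used from Python: the names SVSTAT_RE, SAMPLE and parse_svstat are made up for illustration, and the sample text is a shortened copy of the svstat output from the question.

```python
import re

# Combined pattern from the answer: the "(pid N)" part is wrapped in an
# optional non-capturing group, so the named group "pid" is None when absent.
SVSTAT_RE = re.compile(
    r"/etc/service/(?P<service_name>.+):\s(?P<status>up|down)"
    r"(?:\s\(pid\s(?P<pid>\d+)\))?\s(?P<seconds>\d+)"
)

# Shortened sample input (assumption: same layout as in the question).
SAMPLE = """\
/etc/service/worker-test-1: up (pid 1234) 97381 seconds
/etc/service/worker-test-4: down 9 seconds, normally up
"""


def parse_svstat(text):
    """Return one dict per matching line, with numeric fields converted."""
    results = []
    for match in SVSTAT_RE.finditer(text):
        info = match.groupdict()
        # pid is None for a service that is down; convert only when present.
        info["pid"] = int(info["pid"]) if info["pid"] is not None else None
        info["seconds"] = int(info["seconds"])
        results.append(info)
    return results


for row in parse_svstat(SAMPLE):
    print(row["service_name"], row["status"], row["pid"], row["seconds"])
```

Using groupdict() keeps the result shape identical for both states, so the calling code only has to check whether pid is None.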

As promised, here is my lunch-break alternative. I don't want to talk you into fixed-token split parsing, but it might come in handy when considering the rest of the use case, which only the OP knows ;-)

#! /usr/bin/env python
from __future__ import print_function

d = """
/etc/service/worker-test-1: up (pid 1234) 97381 seconds
/etc/service/worker-test-2: up (pid 4567) 92233 seconds
/etc/service/worker-test-3: up (pid 8910) 97381 seconds
/etc/service/worker-test-4: down 9 seconds, normally up
/etc/service/worker-test-5: down 9 seconds, normally up
/etc/service/worker-test-6: down 9 seconds, normally up
"""


def service_state_parser_gen(text_lines):
    """Parse the lines from the service monitor by splitting
    on a well-known binary condition (either up or down)
    and parse the rest of the fields based on a fixed
    position split on sanitized data (in the up case).
    Yield a tuple of key and dictionary as result, or
    (None, None) when neither up nor down is detected."""

    token_up = ': up '
    token_down = ': down '
    path_sep = '/'

    # Iterate over the text passed in, not the module-level sample.
    for line in text_lines.split('\n'):
        if token_up in line:
            chunks = line.split(token_up)
            status = token_up.strip(': ')
            # The service name is the last component of the path.
            service = chunks[0].split(path_sep)[-1]
            # "(pid 1234) 97381 seconds" -> pid, 1234, 97381, seconds
            _, pid, seconds, _ = chunks[1].replace(
                '(', '').replace(')', '').split()
            yield service, {'name': service,
                            'status': status,
                            'pid': int(pid),
                            'seconds': int(seconds)}
        elif token_down in line:
            chunks = line.split(token_down)
            status = token_down.strip(': ')
            service = chunks[0].split(path_sep)[-1]
            # "9 seconds, normally up" -> 9, "seconds,", normally, up
            seconds, _, _, _ = chunks[1].split()
            yield service, {'name': service,
                            'status': status,
                            'pid': None,
                            'seconds': int(seconds)}
        else:
            yield None, None


def main():
    """Sample driver for parser generator function."""

    services = {}
    for key, status_map in service_state_parser_gen(d):
        if key is None:
            print("Non-Status line ignored.")
        else:
            services[key] = status_map

    print(services)

if __name__ == '__main__':
    main()

When run on the given sample input, it yields:

Non-Status line ignored.
Non-Status line ignored.
{'worker-test-1': {'status': 'up', 'seconds': 97381, 'pid': 1234, 'name': 'worker-test-1'}, 'worker-test-3': {'status': 'up', 'seconds': 97381, 'pid': 8910, 'name': 'worker-test-3'}, 'worker-test-2': {'status': 'up', 'seconds': 92233, 'pid': 4567, 'name': 'worker-test-2'}, 'worker-test-5': {'status': 'down', 'seconds': 9, 'pid': None, 'name': 'worker-test-5'}, 'worker-test-4': {'status': 'down', 'seconds': 9, 'pid': None, 'name': 'worker-test-4'}, 'worker-test-6': {'status': 'down', 'seconds': 9, 'pid': None, 'name': 'worker-test-6'}}

So the information that would otherwise be held in named-group matches is stored, already type-converted, as values under matching keys in a dict. If a service is down there is of course no process id, so pid is mapped to None, which makes it easy to code robustly against it. (If one stored all down services in a separate structure, it would be implicit that accessing a pid field is not advisable.)

Hope it helps. PS: Yes, the argument name text_lines of the showcase function is not ideal for what it contains, but you should get the parsing idea.
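To illustrate coding against a None pid, here is a small sketch that consumes a dict of the shape printed above. The function name split_by_state and the two-entry services dict are hypothetical and only serve the example.

```python
# Two entries in the same shape as the parser output shown above.
services = {
    "worker-test-1": {"name": "worker-test-1", "status": "up",
                      "pid": 1234, "seconds": 97381},
    "worker-test-4": {"name": "worker-test-4", "status": "down",
                      "pid": None, "seconds": 9},
}


def split_by_state(services):
    """Separate running services (name -> pid) from stopped ones (names)."""
    running = {}
    stopped = []
    for name, info in services.items():
        if info["pid"] is None:
            # No process id means the service is down; never touch pid here.
            stopped.append(name)
        else:
            running[name] = info["pid"]
    return running, sorted(stopped)


running, stopped = split_by_state(services)
print(running)
print(stopped)
```

Testing pid with "is None" rather than for truthiness keeps the check explicit about the missing-value case.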

I don't know whether you are required to use a regex at all, but if you aren't, you can do something like this:

if "down" in linetext:
    print( "is down" )
else:
    print( "is up" )

Easier to read and faster as well.
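One caveat with a bare substring test: "down" can also occur elsewhere in the line, for example inside a service name. A slightly stricter sketch, assuming the ": up " and ": down " layout from the sample output in the question, could look like this; service_status and the shutdown-helper service name are hypothetical.

```python
def service_status(line):
    """Return 'up', 'down', or None for a single svstat output line.

    Assumes the state follows the service path as ': up ' or ': down ',
    as in the sample output from the question.
    """
    if ": down " in line:
        return "down"
    if ": up " in line:
        return "up"
    return None


# A service whose name happens to contain "down" is still reported as up.
print(service_status("/etc/service/shutdown-helper: up (pid 42) 10 seconds"))
print(service_status("/etc/service/worker-test-4: down 9 seconds, normally up"))
print(service_status(""))
```

This stays regex-free while also returning None for lines that are not status lines at all.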
