
Combine regular expressions in Python

I am new to regular expressions.

I am trying to get a list of services that are up or down from the output of the svstat command.

Sample output of svstat:

/etc/service/worker-test-1: up (pid 1234) 97381 seconds
/etc/service/worker-test-2: up (pid 4567) 92233 seconds
/etc/service/worker-test-3: up (pid 8910) 97381 seconds
/etc/service/worker-test-4: down 9 seconds, normally up
/etc/service/worker-test-5: down 9 seconds, normally up
/etc/service/worker-test-6: down 9 seconds, normally up

So at the moment I need 2 regular expressions to filter the services that are UP or DOWN.

Sample regex-1 for UP:

/etc/service/(?P<service_name>.+):\s(?P<status>up|down)\s\(pid\s(?P<pid>\d+)\)\s(?P<seconds>\d+)

Output of regex-1:

Match 1
status -> up
service_name -> worker-test-1
pid -> 1234
seconds -> 97381

Match 2
status -> up
service_name -> worker-test-2
pid -> 4567
seconds -> 92233

Match 3
status -> up
service_name -> worker-test-3
pid -> 8910
seconds -> 97381

Sample regex-2 for DOWN:

/etc/service/(?P<service_name>.+):\s(?P<status>up|down)\s(?P<seconds>\d+)

Output of regex-2:

Match 1
status -> down
service_name -> worker-test-4
seconds -> 9

Match 2
status -> down
service_name -> worker-test-5
seconds -> 9

Match 3
status -> down
service_name -> worker-test-6
seconds -> 9

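The question is: how can I get both UP and DOWN services using only 1 regular expression?

By the way, I created and tested these regular expressions with http://pythex.org/.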

You can wrap the pid part in an optional non-capturing group:

/etc/service/(?P<service_name>.+):\s(?P<status>up|down)(?:\s\(pid\s(?P<pid>\d+)\))?\s(?P<seconds>\d+)

If the service is down, the pid group will be None. See the Regex101 demo.
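As a minimal sketch of how the combined pattern might be used from Python: the pattern below is copied from the answer above, while the names `SVSTAT_RE`, `sample` and `parse_svstat` are illustrative assumptions and not part of the original. `re.finditer` together with `groupdict()` gives one dict per matched line, and the `pid` entry is simply `None` when the optional group did not take part in the match.

```python
import re

# Combined pattern from the answer: the "(pid N)" part is optional.
SVSTAT_RE = re.compile(
    r"/etc/service/(?P<service_name>.+):\s(?P<status>up|down)"
    r"(?:\s\(pid\s(?P<pid>\d+)\))?\s(?P<seconds>\d+)"
)

# Two lines taken from the sample svstat output in the question.
sample = """\
/etc/service/worker-test-1: up (pid 1234) 97381 seconds
/etc/service/worker-test-4: down 9 seconds, normally up
"""


def parse_svstat(text):
    """Return one dict per svstat line matched by the combined pattern."""
    services = []
    for match in SVSTAT_RE.finditer(text):
        info = match.groupdict()
        # pid stays None when the service is down (optional group unmatched).
        info["pid"] = int(info["pid"]) if info["pid"] is not None else None
        info["seconds"] = int(info["seconds"])
        services.append(info)
    return services


for service in parse_svstat(sample):
    print(service)
```

Converting `pid` and `seconds` to `int` is a convenience added here; the regular expression itself only ever returns strings or `None`.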

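As promised, here is my lunch-break solution. (I did not really want to bring up parsing by splitting on fixed tokens, but it may come in handy for the remaining use cases that only the OP knows about ;-)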

#! /usr/bin/env python
from __future__ import print_function

d = """
/etc/service/worker-test-1: up (pid 1234) 97381 seconds
/etc/service/worker-test-2: up (pid 4567) 92233 seconds
/etc/service/worker-test-3: up (pid 8910) 97381 seconds
/etc/service/worker-test-4: down 9 seconds, normally up
/etc/service/worker-test-5: down 9 seconds, normally up
/etc/service/worker-test-6: down 9 seconds, normally up
"""


def service_state_parser_gen(text_lines):
    """Parse the lines from service monitor by splitting
    on well known binary condition (either up or down)
    and parse the rest of the fields based on fixed
    position split on sanitized data (in the up case).
    yield tuple of key and dictionary as result or of
    None, None when neither up nor down detected."""

    token_up = ': up '
    token_down = ': down '
    path_sep = '/'

    # Iterate over the argument, not the module-level sample string.
    for line in text_lines.split('\n'):
        if token_up in line:
            chunks = line.split(token_up)
            status = token_up.strip(': ')
            service = chunks[0].split(path_sep)[-1]
            _, pid, seconds, _ = chunks[1].replace(
                '(', '').replace(')', '').split()
            yield service, {'name': service,
                            'status': status,
                            'pid': int(pid),
                            'seconds': int(seconds)}
        elif token_down in line:
            chunks = line.split(token_down)
            status = token_down.strip(': ')
            service = chunks[0].split(path_sep)[-1]
            pid = None
            seconds, _, _, _ = chunks[1].split()
            yield service, {'name': service,
                            'status': status,
                            'pid': None,
                            'seconds': int(seconds)}
        else:
            yield None, None


def main():
    """Sample driver for parser generator function."""

    services = {}
    for key, status_map in service_state_parser_gen(d):
        if key is None:
            print("Non-Status line ignored.")
        else:
            services[key] = status_map

    print(services)

if __name__ == '__main__':
    main()

When run, it produces the following result for the given sample input:

Non-Status line ignored.
Non-Status line ignored.
{'worker-test-1': {'status': 'up', 'seconds': 97381, 'pid': 1234, 'name': 'worker-test-1'}, 'worker-test-3': {'status': 'up', 'seconds': 97381, 'pid': 8910, 'name': 'worker-test-3'}, 'worker-test-2': {'status': 'up', 'seconds': 92233, 'pid': 4567, 'name': 'worker-test-2'}, 'worker-test-5': {'status': 'down', 'seconds': 9, 'pid': None, 'name': 'worker-test-5'}, 'worker-test-4': {'status': 'down', 'seconds': 9, 'pid': None, 'name': 'worker-test-4'}, 'worker-test-6': {'status': 'down', 'seconds': 9, 'pid': None, 'name': 'worker-test-6'}}

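So the information that would otherwise be captured by the named groups is stored, already type-converted, as values under the matching keys of a dict. If a service is down there is of course no process ID, so pid is mapped to None, which makes it easy to write robust code against it. (If all the down services were stored in a separate structure instead, it would be implicit that the pid field should not be accessed.)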
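To illustrate the point about coding against a `pid` of `None`, here is a small sketch that consumes a dict shaped like the parser's result. The helper names `running_pids` and `down_services` are hypothetical, and the two entries are taken from the sample output above.

```python
# Two entries in the same shape as the parser's output above.
services = {
    "worker-test-1": {"name": "worker-test-1", "status": "up",
                      "pid": 1234, "seconds": 97381},
    "worker-test-4": {"name": "worker-test-4", "status": "down",
                      "pid": None, "seconds": 9},
}


def running_pids(services):
    """Map service name to pid, skipping services that have no process."""
    return {name: info["pid"]
            for name, info in services.items()
            if info["pid"] is not None}


def down_services(services):
    """Return the sorted names of all services reported as down."""
    return sorted(name for name, info in services.items()
                  if info["status"] == "down")


print(running_pids(services))
print(down_services(services))
```

Because every entry carries a `pid` key, callers can test for `None` instead of guarding against a missing key.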

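Hope this helps. PS: Yes, given what it actually contains, text_lines is not the best name for the parameter of the function shown, but you should get the idea of the parsing.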

I don't know whether you are forced to use regular expressions, but if you are not, you could do something like this:

if "down" in linetext:
    print( "is down" )
else:
    print( "is up" )

It is easier to read, and faster too.
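One caveat with a bare substring test: it would also report "is down" for a running service whose name happens to contain "down". A slightly more defensive sketch, assuming the svstat line format shown in the question, looks only at the first word after the ": " separator. The helper name `service_status` is hypothetical.

```python
def service_status(linetext):
    """Return 'up' or 'down' for one svstat line, or None if unrecognized."""
    # Everything before the first ': ' is the service path; the status
    # is the first word that follows it.
    _, sep, rest = linetext.partition(": ")
    if not sep:
        return None
    words = rest.split()
    if words and words[0] in ("up", "down"):
        return words[0]
    return None


print(service_status("/etc/service/worker-test-1: up (pid 1234) 97381 seconds"))
print(service_status("/etc/service/worker-test-4: down 9 seconds, normally up"))
```

This still avoids regular expressions, at the cost of a few more lines than the substring check.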
