Regex to capture '/etc/services'

I want to capture some information from the /etc/services file on my UNIX machine, but I capture the wrong value, and I think I am overcomplicating it as well.

What I have now:

import re

with open('/etc/services') as ports_file:
    lines = ports_file.readlines()
    for line in lines:
        print(re.findall(r'((\w*\-*\w+)+\W+(\d+)\/(tcp|udp))', line))

But it is yielding incorrect values like this:

[('dircproxy\t57000/tcp', 'dircproxy', '57000', 'tcp')]
[('tfido\t\t60177/tcp', 'tfido', '60177', 'tcp')]
[('fido\t\t60179/tcp', 'fido', '60179', 'tcp')]

I would want it like this:

[('dircproxy', '57000', 'tcp')]
[('tfido', '60177', 'tcp')]
[('fido', '60179', 'tcp')]

I think this (\w*\-*\w+)+ is needed in my regex because some names are hyphenated, like this-should-capture.

I'd suggest coming at this from a different perspective: instead of matching the field values, match the separators between them.

print(re.split(r'[\s/]+', line.split('#', 1)[0])[:3])

The inner line.split('#', 1)[0] runs first and removes comments (anything after the first # on the line).
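To make the separator-based approach easier to try, here is a minimal sketch that wraps the one-liner in a function and runs it on a few in-memory lines. The function name parse_service_line, the sample lines, the extra .strip() call, and the rule for skipping blank or comment-only lines are assumptions added for illustration; they are not part of the answer above.

```python
import re

# Illustrative lines in /etc/services format (not read from a real file).
SAMPLE_LINES = [
    "# Network services, Internet style\n",
    "\n",
    "dircproxy\t57000/tcp\t\t\t# Detachable IRC Proxy\n",
    "tfido\t\t60177/tcp\t\t\t# fidonet EMSI over telnet\n",
    "this-should-capture\t12345/udp\n",
]


def parse_service_line(line):
    """Return (name, port, protocol), or None for blank/comment-only lines."""
    # Drop the comment first, trim the ends, then split on runs of
    # whitespace and slashes (the separators between the fields).
    fields = re.split(r'[\s/]+', line.split('#', 1)[0].strip())
    # A real entry has at least a name, a port and a protocol.
    if len(fields) < 3:
        return None
    return tuple(fields[:3])


for line in SAMPLE_LINES:
    entry = parse_service_line(line)
    if entry is not None:
        print(entry)
```

Because the hyphen is not one of the separators, a name such as this-should-capture stays in one piece without any special handling.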

I personally wouldn't use regex here. Look at the solution below and see if it fits your needs (also note that you can iterate over the file object directly):

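services = []
with open('/etc/services') as serv:
    for line in serv:
        fields = line.split()  # split on any run of whitespace
        if len(fields) < 2:
            continue  # blank or malformed line
        # The second field looks like '57000/tcp'
        if '/tcp' in fields[1] or '/udp' in fields[1]:
            port, protocol = fields[1].split('/')
            services.append((fields[0], port, protocol))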
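If a regex is still preferred, it may help to know where the extra element in the question's output comes from: re.findall returns one tuple item per capturing group, and the outermost pair of parentheses in the original pattern is itself a group. The sketch below captures only the three wanted fields. The pattern, the name SERVICE_RE, and the sample lines are assumptions for illustration; in particular, the character class [\w-] assumes service names consist of letters, digits, underscores, and hyphens.

```python
import re

# Name, whitespace, port, slash, protocol. Only the three wanted fields
# are capturing groups, so findall returns 3-tuples.
SERVICE_RE = re.compile(r'^([\w-]+)\s+(\d+)/(tcp|udp)\b')

# Illustrative lines in /etc/services format (not read from a real file).
sample_lines = [
    "dircproxy\t57000/tcp\t\t\t# Detachable IRC Proxy\n",
    "tfido\t\t60177/tcp\n",
    "this-should-capture\t12345/udp\n",
    "# a comment line\n",
]

for line in sample_lines:
    matches = SERVICE_RE.findall(line)
    if matches:  # skip lines that are not service entries
        print(matches)
```

Inside a character class the hyphen needs no repeated sub-pattern, so [\w-]+ covers names like this-should-capture directly. Where grouping is needed without capturing, (?:...) can be used instead of plain parentheses.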
