Regex to capture '/etc/services'

Question

I want to capture some info from the /etc/services file on my UNIX machine, but I capture the wrong value, and I think I am also overcomplicating it.

What I have now:

import re

with open('/etc/services') as ports_file:
    lines = ports_file.readlines()
    for line in lines:
        print(re.findall(r'((\w*\-*\w+)+\W+(\d+)\/(tcp|udp))', line))

But it is yielding incorrect values like this:

[('dircproxy\t57000/tcp', 'dircproxy', '57000', 'tcp')]
[('tfido\t\t60177/tcp', 'tfido', '60177', 'tcp')]
[('fido\t\t60179/tcp', 'fido', '60179', 'tcp')]

I would like it to look like this:

[('dircproxy', '57000', 'tcp')]
[('tfido', '60177', 'tcp')]
[('fido', '60179', 'tcp')]

I think (\w*\-*\w+)+ is needed in my regex because some service names are hyphenated, like this-should-capture.
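For context, a minimal sketch of where the extra first element comes from: re.findall returns one tuple entry per capturing group, and the pattern above wraps the whole match in an outer pair of parentheses. The sample text, the 12345/udp entry, and the names pattern and parse_lines are assumptions made up for illustration; a character class such as [\w-]+ is one possible way to allow hyphens, not necessarily the only one.

```python
import re

# Hypothetical sample in the same layout as /etc/services.
sample = (
    "dircproxy\t57000/tcp\n"
    "tfido\t\t60177/tcp\n"
    "this-should-capture\t12345/udp\n"  # made-up hyphenated entry
)

# Only the three wanted fields are capturing groups; there is no outer
# group around the whole match, so findall yields 3-tuples.
pattern = re.compile(r'([\w-]+)\s+(\d+)/(tcp|udp)')

def parse_lines(text):
    """Collect (name, port, protocol) tuples from every line of text."""
    results = []
    for line in text.splitlines():
        results.extend(pattern.findall(line))
    return results

rows = parse_lines(sample)
for row in rows:
    print(row)
```

This sketch does not skip comments, so a commented-out entry would still match.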

Answer 1

I'd suggest coming at this from a different perspective: instead of matching the field values, match the separators between them.

print(re.split(r'[\s/]+', line.split('#', 1)[0])[:3])

The inner line.split('#', 1)[0] runs first and removes comments (anything after the first # on the line).
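A rough sketch of how the one-liner might be wrapped for a whole file. Applied as-is, blank lines and comment-only lines produce lists of empty strings, so this version filters them out. The helper name split_service_line and the sample lines are assumptions for illustration, not part of the original answer.

```python
import re

# Hypothetical sample lines, including a comment-only and a blank line.
sample_lines = [
    "dircproxy\t57000/tcp\t\t# hypothetical trailing comment\n",
    "# a full-line comment\n",
    "\n",
    "fido\t\t60179/tcp\n",
]

def split_service_line(line):
    """Return (name, port, protocol), or None if the line has no entry."""
    # Drop everything from the first '#' on this line, then trim whitespace
    # so re.split does not produce empty leading or trailing fields.
    content = line.split('#', 1)[0].strip()
    if not content:
        return None
    # Split on runs of whitespace and '/' (the separators between fields).
    fields = re.split(r'[\s/]+', content)
    if len(fields) < 3:
        return None
    return tuple(fields[:3])

parsed = [
    entry
    for entry in (split_service_line(line) for line in sample_lines)
    if entry is not None
]
for entry in parsed:
    print(entry)
```

With a real file, the same helper could be called inside a loop over open('/etc/services').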

Answer 2

I personally wouldn't use regex here. Look at the solution below and see whether it fits your needs (also note that you can iterate over the file object directly):

services = []
with open('/etc/services') as serv:
    for line in serv:
        l = line.split()
        if len(l) < 2:
            continue
        if '/tcp' in l[1] or '/udp' in l[1]:
            port, protocol = l[1].split('/')
            services.append((l[0], port, protocol))

2 answers

Answer 1
1 accepted 2017-10-12 19:57:59

Answer 2
0 2017-10-12 20:06:05
