Python - Understanding Regular Expressions

Question

I'm pulling a list of usernames from a Linux server at school. The code below lists the directory where they are kept and saves the output as information:

#!/usr/bin/env python
import subprocess, sys

r = subprocess.Popen(['ls','/home/ADILSTU'], stdout=subprocess.PIPE)
information = r.stdout.read()
print information, str(information)

That works fine and lists the users one per line, like this (there are at least 100 usernames):

ajax2
jjape3
jaxe32

My problem is that I want to create a "look-up" for these usernames. This is my code to search for usernames that start with a single letter j (so it should only list jaxe32 from this list):

#lookup
import re
p = re.compile(r'j(?!j)\w*')
print p.match(str(information)).group()

But when I run this I get the error below. If I remove .group(), it just prints "None" with no error. So I'm not sure whether the list is being saved to a string correctly, or whether I'm just missing something obvious. I only want to use regular expressions for this, nothing else.

    Traceback (most recent call last):
    File "getInformation.py", line 11, in <module>
    print p.match(str(information)).group()
    AttributeError: 'NoneType' object has no attribute 'group'

Answer 1

From the documentation on re.match:

If zero or more characters at the beginning of string match the regular expression pattern, return a corresponding match object. Return None if the string does not match the pattern;

re.match is only useful if the match starts at the beginning of the string; it does not find all matches in a string.

This leaves you with two main options:

Split the input by line and use re.match
Use multiline matching and re.findall

Option 1:

import re
import subprocess

r = subprocess.Popen(['ls', '/home/administrator/sotest'], stdout=subprocess.PIPE)
information = r.stdout.read().decode('utf-8').split('\n') # ['ajax2', 'jaxe32', 'jjape3', '']

for user in information:
    s = re.match(r'j(?!j)\w*', user)
    if s:
        print(s.group())

Output:

jaxe32

Option 2 (using (?m)^j(?!j)\w*$):

import re
import subprocess

r = subprocess.Popen(['ls', '/home/administrator/sotest'], stdout=subprocess.PIPE)
information = r.stdout.read().decode('utf-8') # 'ajax2\njaxe32\njjape3\n'

print(re.findall(r'(?m)^j(?!j)\w*$', information))

Output:

['jaxe32']
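
To see why the ^ anchor and the (?m) flag matter in Option 2, here is a minimal sketch. It assumes the ls output can be stood in for by a hard-coded string built from the three sample names in the question; the variable names at_start, unanchored and anchored are only illustrative.

```python
import re

# Hard-coded stand-in for the `ls` output (an assumption; names are from the question).
information = "ajax2\njjape3\njaxe32\n"

# match() only tries position 0, where the text starts with "a", so it returns None.
at_start = re.match(r'j(?!j)\w*', information)

# Without the ^ anchor, findall() also picks up fragments from inside other names.
unanchored = re.findall(r'j(?!j)\w*', information)

# With (?m), ^ and $ match at every line boundary, so only whole lines qualify.
anchored = re.findall(r'(?m)^j(?!j)\w*$', information)

print(at_start)    # None
print(unanchored)  # ['jax2', 'jape3', 'jaxe32']
print(anchored)    # ['jaxe32']
```

The unanchored result shows that simply switching from match to findall would not be enough: "jax2" comes from the middle of ajax2 and "jape3" from the second j of jjape3.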

Answer 2

The problem is that when the match method doesn't match anything, it doesn't return an empty match object on which you could call the group method; it returns None, which does not have a group method. Just check for None before you call any methods.

#lookup
import re
p = re.compile(r'j(?!j)\w*')
result = p.match(str(information))
if result:
    print result.group()
