Python - Understanding Regular Expression

So, I'm taking a list of usernames from a Linux server at school. The code below lists the directory where the users' home directories are kept and saves the output as information:

#!/usr/bin/env python
import subprocess, sys

r = subprocess.Popen(['ls','/home/ADILSTU'], stdout=subprocess.PIPE)
information = r.stdout.read()
print information, str(information)

That works just fine and lists the users one per line, like this (there are at least 100 usernames):

ajax2
jjape3
jaxe32    

My problem is that I want to create a "look-up" for these usernames. This is my code to search for usernames that start with a single letter j (so it should list only jaxe32 from this list):

#lookup
import re
p = re.compile(r'j(?!j)\w*')
print p.match(str(information)).group()

But when I run this I get the error below. If I get rid of .group(), it just prints None, with no error. So I'm not sure whether the list is being saved to a string correctly, or whether I'm just missing something obvious. I only want to use regular expressions for this, nothing else.

    Traceback (most recent call last):
    File "getInformation.py", line 11, in <module>
    print p.match(str(information)).group()
    AttributeError: 'NoneType' object has no attribute 'group'

From the documentation on re.match:

If zero or more characters at the beginning of string match the regular expression pattern, return a corresponding match object. Return None if the string does not match the pattern;

re.match only succeeds if the match starts at the beginning of the string; it does not scan the string for all matches. In the question, the listing begins with ajax2, so the pattern fails at position 0 and re.match returns None.
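To make the difference concrete, here is a small sketch that uses a hard-coded string in place of the ls output. The sample variable is an assumption, built from the three usernames shown in the question.

```python
import re

# Stand-in for the ls output (assumption: the three names from the question).
sample = "ajax2\njjape3\njaxe32\n"
pattern = re.compile(r'j(?!j)\w*')

# match() only tries position 0, where the string starts with "a".
at_start = pattern.match(sample)
print(at_start)            # None

# search() scans forward and returns the first match anywhere in the string,
# which here is the tail of "ajax2", not a whole username.
first_hit = pattern.search(sample)
print(first_hit.group())   # jax2
```

So switching to re.search alone would not be enough either: the pattern also has to be tied to the start of each username.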

This leaves you with two main options:

  • Split the input by line and use re.match

  • Use multiline matching and re.findall

Option 1:

r = subprocess.Popen(['ls', '/home/administrator/sotest'], stdout=subprocess.PIPE)
information = r.stdout.read().decode('utf-8').split('\n') # ['ajax2', 'jaxe32', 'jjape3', '']

for user in information:
    s = re.match(r'j(?!j)\w*', user)
    if s:
        print(s.group())

Output:

jaxe32
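The same idea can be wrapped in a small helper that is easy to try without a server. This is only a sketch: the function name users_starting_with_single_j and the hard-coded listing are hypothetical, not part of the original answer.

```python
import re

def users_starting_with_single_j(listing):
    """Return names from a newline-separated listing that start with one j."""
    pattern = re.compile(r'j(?!j)\w*')
    found = []
    # splitlines() avoids the trailing empty string that split('\n') produces.
    for name in listing.splitlines():
        m = pattern.match(name)  # match() is anchored at the start of each name
        if m:
            found.append(m.group())
    return found

print(users_starting_with_single_j("ajax2\njaxe32\njjape3\n"))  # ['jaxe32']
```

Because each name is tested on its own, re.match's start-of-string behaviour is exactly what is needed here.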

Option 2 (using (?m)^j(?!j)\w*$):

r = subprocess.Popen(['ls', '/home/administrator/sotest'], stdout=subprocess.PIPE)
information = r.stdout.read().decode('utf-8') # 'ajax2\njaxe32\njjape3\n'

print(re.findall(r'(?m)^j(?!j)\w*$', information))

Output:

['jaxe32']
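The anchors matter here. A short sketch, again with a hard-coded sample string standing in for the ls output (an assumption), compares the unanchored pattern with the anchored one and uses the re.MULTILINE flag, which is the flag form of the inline (?m):

```python
import re

# Stand-in for the decoded ls output (assumption).
sample = "ajax2\njaxe32\njjape3\n"

# Without anchors, findall also picks up fragments from inside other names.
unanchored = re.findall(r'j(?!j)\w*', sample)

# With re.MULTILINE, ^ and $ match at the start and end of every line.
anchored = re.findall(r'^j(?!j)\w*$', sample, flags=re.MULTILINE)

print(unanchored)  # ['jax2', 'jaxe32', 'jape3']
print(anchored)    # ['jaxe32']
```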

The problem is that when the match method doesn't match anything, it doesn't return an empty match object on which you could call the group method; it returns None, which has no group method. Just check for None before you call any methods.

#lookup
import re
p = re.compile(r'j(?!j)\w*')
result = p.match(str(information))
if result:
    print result.group()
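A Python 3 version of the same check might look like the sketch below, with a hard-coded stand-in for information (an assumption for illustration). Note that for this input the guard avoids the AttributeError but still finds nothing, because match only tries the start of the string.

```python
import re

# Stand-in for the ls output (assumption: the three names from the question).
information = "ajax2\njjape3\njaxe32\n"
p = re.compile(r'j(?!j)\w*')

result = p.match(information)
if result is not None:
    print(result.group())
else:
    # Reached here: the string starts with "ajax2", so there is no match at position 0.
    print("no match at the start of the string")
```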
