Python - Understanding Regular Expression
So, I'm taking a list of usernames from a Linux server at school. The code below lists the directory where they are kept and saves the output as information:
#!/usr/bin/env python
import subprocess, sys
r = subprocess.Popen(['ls','/home/ADILSTU'], stdout=subprocess.PIPE)
information = r.stdout.read()
print information, str(information)
That works just fine and lists the users one per line, like this (there are at least 100 usernames):
ajax2
jjape3
jaxe32
My problem is that I want to create a "look-up" for these usernames. This is my code to search for usernames that start with a single letter j, not jj (so it should list only jaxe32 from this list):
#lookup
import re
p = re.compile(r'j(?!j)\w*')
print p.match(str(information)).group()
But when I run this I get the error below. If I get rid of .group(), it just prints "None", with no error. So I'm not sure whether the list is being saved to a string correctly, or whether I'm just missing something obvious. I only want to use regular expressions for this, not anything else.
Traceback (most recent call last):
  File "getInformation.py", line 11, in <module>
    print p.match(str(information)).group()
AttributeError: 'NoneType' object has no attribute 'group'
From the documentation on re.match:
If zero or more characters at the beginning of string match the regular expression pattern, return a corresponding match object. Return None if the string does not match the pattern; note that this is different from a zero-length match.
re.match is only useful if the match starts at the beginning of the string. It does not find all matches in a string.
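To make the difference concrete, here is a minimal sketch that uses the three usernames from the question as a hard-coded string instead of reading them from ls. The variable names are only for illustration.

```python
import re

# Sample listing, hard-coded for illustration (assumption: one username per line)
listing = "ajax2\njjape3\njaxe32\n"
pattern = re.compile(r'j(?!j)\w*')

# match() only tries position 0, where the string starts with "a"
at_start = pattern.match(listing)
print(at_start)  # None

# search() scans forward and returns the first match anywhere in the string
first_anywhere = pattern.search(listing)
print(first_anywhere.group())  # jax2

# findall() returns every non-overlapping match, including mid-name ones
all_matches = pattern.findall(listing)
print(all_matches)  # ['jax2', 'jape3', 'jaxe32']
```

Note that without an anchor the pattern also matches in the middle of a name (jax2 inside ajax2, jape3 inside jjape3), so switching from match to search or findall is not enough on its own.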
This leaves you with two main options:
1. Split the input by line and use re.match
2. Use multiline matching and re.findall
Option 1:
import re
import subprocess

r = subprocess.Popen(['ls', '/home/administrator/sotest'], stdout=subprocess.PIPE)
information = r.stdout.read().decode('utf-8').split('\n') # ['ajax2', 'jaxe32', 'jjape3', '']
for user in information:
    # match() anchors at the start of each individual username
    s = re.match(r'j(?!j)\w*', user)
    if s:
        print(s.group())
Output:
jaxe32
Option 2 (using (?m)^j(?!j)\w*$):
r = subprocess.Popen(['ls', '/home/administrator/sotest'], stdout=subprocess.PIPE)
information = r.stdout.read().decode('utf-8') # 'ajax2\njaxe32\njjape3\n'
print(re.findall(r'(?m)^j(?!j)\w*$', information))
Output:
['jaxe32']
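The inline (?m) at the start of the pattern turns on multiline mode, which is the same as passing re.MULTILINE as a flag. As a small sketch of why it matters here, using a hard-coded sample string rather than real ls output:

```python
import re

# Sample data, hard-coded for illustration
listing = "ajax2\njaxe32\njjape3\n"

# Without multiline mode, ^ only matches at the very start of the whole string
without_flag = re.findall(r'^j(?!j)\w*$', listing)

# re.MULTILINE is the flag form of the inline (?m): ^ and $ now match at each line
with_flag = re.findall(r'^j(?!j)\w*$', listing, flags=re.MULTILINE)

print(without_flag)  # []
print(with_flag)     # ['jaxe32']
```

Either spelling should behave the same; the flag form can be easier to read when the pattern is compiled once and reused.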
The problem is that when the match method doesn't match anything, it doesn't return an empty match object on which you could call the group method. It returns None, which does not have a group method. Just check for None before you call any methods.
#lookup
import re
p = re.compile(r'j(?!j)\w*')
result = p.match(str(information))
if result:
    # only call group() when match() actually returned a match object
    print(result.group())
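One thing worth noting: the None check removes the AttributeError, but match still only looks at the start of the string. A rough sketch with the question's usernames hard-coded (safe_group is a hypothetical helper name, not part of the re module):

```python
import re

p = re.compile(r'j(?!j)\w*')

def safe_group(text):
    """Return the matched text, or None when match() finds nothing (hypothetical helper)."""
    result = p.match(text)
    return result.group() if result else None

# The whole listing starts with "ajax2", so match() finds nothing at position 0
whole_listing = safe_group("ajax2\njjape3\njaxe32\n")
# A single username that starts with one j matches as expected
single_name = safe_group("jaxe32")

print(whole_listing)  # None
print(single_name)    # jaxe32
```

So on the full listing the guarded code prints nothing rather than crashing. To actually list the matching names, the pattern has to be applied per line or in multiline mode.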