Regex case-insensitive searches not matching the exact word
I'm using the following regex to search for three different string formats at once. I'm also using re.IGNORECASE to match both upper- and lower-case strings. However, when I perform a search (e.g. 'locality'), I also get string matches for 'localit', 'locali', 'local' and so on. I want to match the exact word (e.g. 'locality').

Also, if there is whitespace between the characters of the string (e.g. 'l ocal i ty'), I want to ignore it. I have not found a re method that allows me to do that. I tried using re.ASCII, but I get an error: "...ascii is invalid." Any assistance is appreciated.

    # Fragment of a larger script; it needs "import os" and "import re".
    elif searchType == '2':
        print " Directory to be searched: c:\Python27 "
        directory = os.path.join("c:\\", "Python27")
        userstring = raw_input("Enter a string name to search: ")
        # The same string as hex digits and as space-separated ASCII codes
        userStrHEX = userstring.encode('hex')
        userStrASCII = ' '.join(str(ord(char)) for char in userstring)
        regex = re.compile(r"(%s|%s|%s)" % (re.escape(userstring), re.escape(userStrHEX), re.escape(userStrASCII)), re.IGNORECASE)
        for root, dirname, files in os.walk(directory):
            for file in files:
                if file.endswith(".log") or file.endswith(".txt"):
                    f = open(os.path.join(root, file))
                    for line in f.readlines():
                        #if userstring in line:
                        if regex.search(line):
                            print "file: " + os.path.join(root, file)
                            break
                    else:
                        #print "String NOT Found!"
                        pass  # no match in this file; go on to the next one
                    f.close()

There is no such flag in re, so one option is to construct a regex with explicit whitespace matching after every character:

    r'\s*'.join(re.escape(c) for c in userstring)

This works: with that pattern compiled with re.IGNORECASE as myre, myre.findall(line) finds 'l Oc ALi ty'.
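
A minimal sketch of this whitespace-tolerant pattern is shown here. The helper name spaced_pattern and the sample strings are made up for illustration; only re.escape, re.compile and re.IGNORECASE come from the re module itself.

```python
import re


def spaced_pattern(word):
    """Build a case-insensitive regex that matches `word` even when
    whitespace is scattered between its characters."""
    # Escape each character so regex metacharacters are taken literally,
    # then allow any run of whitespace between neighbouring characters.
    body = r"\s*".join(re.escape(c) for c in word)
    return re.compile(body, re.IGNORECASE)


# Hypothetical search term and sample lines, for illustration only.
myre = spaced_pattern("locality")

print(myre.findall("found l Oc ALi ty here"))  # ['l Oc ALi ty']
print(myre.findall("LOCALITY"))                # ['LOCALITY']
print(myre.findall("local"))                   # []
```

Escaping each character separately keeps the pattern safe if the search term contains characters such as '.' or '(', and the match text returned by findall still carries the original spacing.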
Alternatively, if you only need to detect matches to the pattern and do nothing further with the actual match text, use str.translate(None, deletechars) (Python 2) to strip whitespace from the line before matching. For example, call line.translate(None, ' \t\n\r').lower() before you try to match. (Keep a copy of the unmodified line.)
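
The strip-then-match idea might look roughly like the sketch below. It swaps in "".join(line.split()) for the Python 2 only line.translate(None, ...) call so that it runs on both Python 2 and 3; the helper name squeeze and the sample line are assumptions for illustration.

```python
import re


def squeeze(line):
    """Return `line` with all whitespace removed."""
    # str.split() with no arguments splits on any run of whitespace,
    # so joining the pieces drops spaces, tabs and newlines alike.
    return "".join(line.split())


# Hypothetical search term; re.IGNORECASE makes a separate .lower() unnecessary.
regex = re.compile(re.escape("locality"), re.IGNORECASE)

original = "path: l Oc ALi\tty\n"  # keep the untouched line for reporting
squeezed = squeeze(original)

if regex.search(squeezed):
    print("match in: %r" % original)
```

Because all whitespace is dropped, this can also report a match that spans two neighbouring words in the original line, so it suits detection better than extraction.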