Case-insensitive regex search does not match the exact word

Question

I'm using the following regex to search for three different string formats at once. I'm also using re.IGNORECASE to match both upper- and lower-case strings. However, when I perform a search (e.g. 'locality'), I also get matches for 'localit', 'locali', 'local' and so on. I want to match only the exact word (e.g. 'locality').

Also, if there is whitespace between the characters of the string (e.g. 'l ocal i ty'), I want to ignore it. I have not found a re method that allows me to do that. I tried using re.ASCII, but I get an error: "...ascii is invalid." Any assistance is appreciated.

elif searchType == '2':
  # Python 2 excerpt; assumes os and re were imported earlier in the script.
  print "  Directory to be searched: c:\\Python27 "
  directory = os.path.join("c:\\", "Python27")
  userstring = raw_input("Enter a string name to search: ")
  # Also search for the hex-encoded form and the space-separated decimal ASCII codes.
  userStrHEX = userstring.encode('hex')
  userStrASCII = ' '.join(str(ord(char)) for char in userstring)
  regex = re.compile(r"(%s|%s|%s)" % (re.escape(userstring), re.escape(userStrHEX), re.escape(userStrASCII)), re.IGNORECASE)
  for root, dirname, files in os.walk(directory):
     for file in files:
         if file.endswith(".log") or file.endswith(".txt"):
            f = open(os.path.join(root, file))
            for line in f.readlines():
               #if userstring in line:
               if regex.search(line):
                  print "file: " + os.path.join(root, file)
                  break
            else:
               # No line matched; carry on with the next file instead of leaving the loop.
               #print "String NOT Found!"
               pass
            f.close()

Answer 1

There is no such flag in re, so either:

construct a regex with explicit whitespace matching between consecutive characters:
r'\s*'.join(re.escape(c) for c in userstring)
This works: with that pattern compiled as myre (using re.IGNORECASE), myre.findall(line) finds 'l Oc ALi ty'
or (if you only need to detect matches to the pattern, and don't need to do anything further with the actual match text) use the Python 2 str.translate(None, deletechars) to strip whitespace from the line before matching, e.g. do line.translate(None, ' \t\n\r').lower() before you try to match. (Keep a copy of the unmodified line.)
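
Both options can be sketched roughly as follows. This is a minimal illustration, not the answerer's own code: it assumes Python 3 (where str.translate(None, ...) is no longer available, so whitespace is stripped with split/join instead), and the helper names build_word_pattern and contains_ignoring_whitespace are made up for the example. The (?<!\w) and (?!\w) lookarounds are an addition that is not part of the answer above; they are one way to reject partial-word matches such as 'local' inside 'locality'.

```python
import re


def build_word_pattern(word):
    """Compile a case-insensitive pattern for `word` that tolerates
    whitespace between its characters (hypothetical helper)."""
    # Escape every character on its own, then allow optional whitespace
    # between consecutive characters.
    body = r"\s*".join(re.escape(ch) for ch in word)
    # Assumption: "exact word" means no word character directly before
    # or after the match, expressed here with lookarounds.
    return re.compile(r"(?<!\w)" + body + r"(?!\w)", re.IGNORECASE)


def contains_ignoring_whitespace(word, line):
    """Detection-only variant: drop all whitespace and lowercase both
    strings, then do a plain substring test (hypothetical helper)."""
    squeezed_line = "".join(line.split()).lower()
    squeezed_word = "".join(word.split()).lower()
    return squeezed_word in squeezed_line


if __name__ == "__main__":
    pattern = build_word_pattern("locality")
    for text in ("the l Oc ALi ty file", "LOCALITY", "localit", "localities"):
        print(repr(text), bool(pattern.search(text)))
    print(contains_ignoring_whitespace("locality", "x l Oc ALi ty y"))
```

Note the trade-offs: once whitespace inside the word is tolerated, a search for 'local' will still match the text 'local ity', because the space after 'local' looks like a word boundary. The stripping variant cannot check word boundaries at all, since the whitespace that separated the words is gone.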

Answer 1
2 2011-07-04 21:16:32