
re.search in Python regex not working as intended

QUESTION:

I am a beginner with Python and am using Python's regex engine. I am able to match several sample regex patterns against my seed file, but oddly enough, I am unable to match the lines in my sample file that contain the word "repeat". Below is the context of my issue. What could be the reason?

Sample Text:

import tset flash_read, flash_writ;

vector ( $tset, (XMOSI,XMISO,XSGLK,XSTRMSTRT,XSTRMSGLK,XSTRMGKEN,XXTALIN,XXTALGPUEN,XHV (XSTRM03,XSTRMO2,XSTRM01,XSTRIADO,XNSS3,XNSS2,XNSS1,XNSSOH, XTEGLOGK, XRXDATA, XRXENABLE, XTXDATA, XTXENABLE, XNRESET, ROOK, XTMS, XTDI, XTDO, XNTRST))

> flash_writ .d0000 .dFF 1 01 01 01 01 X 1; // write byte 0

> flash_writ .d0001 .dFF 1 0 1 0 1 01 01 X 1; // write byte 1

repeat 25> flash_writ .d0000 .d00 1 1 1 0001  0 1 X 1; // wait program time 
> flash_writ .d0002 .dFF 1 0 1 0 1 0 1 0 1 X 1; // write byte 0

> flash_writ .d0003 .dFF 1 0 1 0 1 0 1 0 1 X 1; // write byte 1

repeat 25> flash_writ .d0000 .d00 1 1 1 0001  0 1 X 1; // wait program time 
> flash_writ .d0004 .dFF 1 01 01 01 01 X1, 11 write byte 0

> flash_writ .d0005 .dFF 1 01 01 01 01 X1; // write byte 1

repeat 25> flash_writ .d0000 .d00 1 1 1 0001  0 1 X 1; // wait program time

Python syntax used for the regex search:

import re

regex_rep = r" repeat "

# files_atp is the already-opened sample file (or a list of its lines)
for num, eachline in enumerate(files_atp):
    if re.search(regex_rep, eachline, flags=re.IGNORECASE) is not None:
        print(eachline)

THIS IS NOT WORKING (does not produce any matches)

Your pattern is:

regex_rep = r" repeat "

This will match the word repeat with a space on each end.

But your lines look like this:

repeat 25> flash_writ .d0000 .d00 1 1 1 0001  0 1 X 1; // wait program time

There's no space before repeat, so it doesn't match your pattern.
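A quick check makes the mismatch concrete. This is only a minimal sketch using one line copied from the sample text; the variable names are illustrative and not part of the original code.

```python
import re

# One line copied from the sample text; note that it starts with "repeat".
line = "repeat 25> flash_writ .d0000 .d00 1 1 1 0001  0 1 X 1; // wait program time"

# The original pattern requires a space on both sides of the word.
with_spaces = re.search(r" repeat ", line, flags=re.IGNORECASE)

# Without the surrounding spaces, the word at the start of the line is found.
without_spaces = re.search(r"repeat", line, flags=re.IGNORECASE)

print(with_spaces)            # None
print(without_spaces.span())  # (0, 6)
```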

It's hard to suggest how to fix this, because I'm not sure why you put those spaces in the pattern in the first place.


If they're there for no reason, just get rid of them:

regex_rep = r"repeat"

However, in that case, you're not using any features of re at all, so your test would be better written as:

if "repeat" in eachline:
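One caveat worth keeping in mind: the in test is case-sensitive, whereas the original call passed re.IGNORECASE. If case-insensitive matching matters, lowercasing the line first is one possible way to keep that behaviour. The sketch below uses shortened, partly made-up sample lines (the upper-case one does not appear in the original file).

```python
# Shortened sample lines; the "REPEAT" line is a made-up variant for illustration.
lines = [
    "repeat 25> flash_writ .d0000 .d00 1 1 1 0001  0 1 X 1;",
    "REPEAT 25> flash_writ .d0000 .d00 1 1 1 0001  0 1 X 1;",
    "> flash_writ .d0001 .dFF 1 0 1 0 1 01 01 X 1;",
]

# Plain substring test: matches only the exact lower-case spelling.
case_sensitive = [line for line in lines if "repeat" in line]

# Lowercasing first approximates what re.IGNORECASE did in the original code.
case_insensitive = [line for line in lines if "repeat" in line.lower()]

print(len(case_sensitive))    # 1
print(len(case_insensitive))  # 2
```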

If they're there to make the pattern more readable, and you want re to ignore the spaces, you can use the VERBOSE flag to tell it to ignore spaces in your pattern:

if re.search(regex_rep, eachline, flags=re.IGNORECASE|re.VERBOSE) is not None:

You can see this working at regex101.
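The effect of the flag can also be checked locally. The following is a small sketch, assuming the same pattern and one line from the sample text; plain and verbose are illustrative names.

```python
import re

regex_rep = r" repeat "
line = "repeat 25> flash_writ .d0000 .d00 1 1 1 0001  0 1 X 1; // wait program time"

# Without VERBOSE, the spaces in the pattern are literal and must be matched.
plain = re.search(regex_rep, line, flags=re.IGNORECASE)

# With VERBOSE, unescaped whitespace in the pattern is ignored,
# so the pattern effectively becomes "repeat".
verbose = re.search(regex_rep, line, flags=re.IGNORECASE | re.VERBOSE)

print(plain)            # None
print(verbose.group())  # repeat
```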


If you want to make sure you're matching repeat as a whole word, rather than as part of a larger word like repeatable, you can use the \b special sequence, which:

Matches the empty string, but only at the beginning or end of a word…

regex_rep = r"\brepeat\b"

You can see this in action at regex101.
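To see the word-boundary behaviour without leaving Python, here is a short sketch. The test strings are made up for illustration (only the first resembles the sample file), and expected simply records what the pattern should do for each one.

```python
import re

regex_rep = r"\brepeat\b"

# Made-up strings mapped to whether the whole word "repeat" should be found.
expected = {
    "repeat 25> flash_writ .d0000": True,    # whole word at the start of the line
    "REPEAT 25> flash_writ .d0000": True,    # matched thanks to IGNORECASE
    "repeatable 25> flash_writ": False,      # part of a longer word
    "norepeat 25> flash_writ": False,        # no boundary before "repeat"
}

results = {
    text: re.search(regex_rep, text, flags=re.IGNORECASE) is not None
    for text in expected
}

for text, found in results.items():
    print(found, text)
```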
