
Match a string at the beginning of the line in Python

I want to search for a certain pattern at the beginning of every line in a text file.

Here are the contents of the text file:

module abc ( A, B, C, NSUP, PSUP, SEL );
input NSUP;
input PSUP;
input SEL;
inout A;
inout B;
output C;
//sample text input pins
//sample text output pins

I want the output to be:

NSUP
PSUP
SEL
A
B
C

I tried the code below, but the output is empty:

import re

fh=open("VamsModel","r")
contents=fh.read()
inoutPortList=re.compile(r'^(input|output|inout)\s+(\w+)')
matches = inoutPortList.finditer(contents)

for match in matches:
    print(match.group(2))

If I remove "^" from the re.compile pattern, it works, but then it no longer looks for the pattern only at the beginning of a line.

inoutPortList=re.compile(r'(input|output|inout)\s+(\w+)')

The regex above also matches the last two lines of my text file (shown below), which I don't want:

//sample text input pins
//sample text output pins

Any ideas why my regex doesn't work when I use "^"?

With ^ on a whole buffer and no flags, you're looking for your expression only at the very start of the buffer. Your file starts with "module", so nothing matches.
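A small sketch of that behavior, using two made-up strings (the variable names are only for illustration): without any flags, the pattern from the question can match only at position 0 of the string.

```python
import re

# Pattern from the question, compiled without any flags.
pattern = re.compile(r'^(input|output|inout)\s+(\w+)')

# This string begins with "input", so ^ can match at position 0.
starts_with_port = "input NSUP;\ninput PSUP;"
# This string begins with the module header, so nothing matches at position 0.
starts_with_module = "module abc ( A );\ninput NSUP;"

first = [m.group(2) for m in pattern.finditer(starts_with_port)]
second = [m.group(2) for m in pattern.finditer(starts_with_module)]

print(first)   # ['NSUP']  (the second line is not matched)
print(second)  # []
```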

To look for the expression at the start of each line, use the multiline flag:

inoutPortList=re.compile(r'^(input|output|inout)\s+(\w+)',flags=re.M)

Output:

NSUP
PSUP
SEL
A
B
C
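For reference, here is a self-contained version that can be run without the VamsModel file. It assumes the file contents are exactly the sample from the question, pasted into a string:

```python
import re

# Sample text from the question, inlined so no file is needed.
contents = """module abc ( A, B, C, NSUP, PSUP, SEL );
input NSUP;
input PSUP;
input SEL;
inout A;
inout B;
output C;
//sample text input pins
//sample text output pins
"""

# With re.M, ^ matches at the start of every line, not just the start of the string.
inoutPortList = re.compile(r'^(input|output|inout)\s+(\w+)', flags=re.M)
ports = [match.group(2) for match in inoutPortList.finditer(contents)]

print(ports)  # ['NSUP', 'PSUP', 'SEL', 'A', 'B', 'C']
```

The two comment lines are skipped because they start with "//", so "input" and "output" are not at the beginning of those lines.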

Aside: with the re module, always pass flags as a keyword argument: flags=re.M, not just re.M. Passing it positionally works with re.compile but not with re.sub, whose count parameter comes before flags, so re.M is silently taken as the count, which creates weird issues.
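A quick sketch of that pitfall on a made-up three-line string. Some recent Python versions emit a DeprecationWarning when count is passed positionally, so the example silences warnings around the "wrong" call:

```python
import re
import warnings

text = "a1\na2\na3"

# Pitfall: the fourth positional argument of re.sub is count, not flags,
# so re.M (integer value 8) is used as count and no flag is set at all.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # silence the positional-count warning, if any
    wrong = re.sub(r"^a", "b", text, re.M)

# Passing the flag by keyword gives the intended multiline behavior.
right = re.sub(r"^a", "b", text, flags=re.M)

print(repr(wrong))  # 'b1\na2\na3'
print(repr(right))  # 'b1\nb2\nb3'
```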

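You need to use the re.MULTILINE flag (re.M for short) so that ^ matches at the start of each line instead of only at the start of the string: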

inoutPortList=re.compile(r'^(input|output|inout)\s+(\w+)', re.M)

If you know which strings you're looking for, you could use startswith in place of regular expressions:

if line.startswith(("input", "output", "inout")):
    print(line.split()[1].rstrip(";"))  # second field, without the trailing ";"
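To try the startswith approach end to end, here is a minimal sketch that loops over the sample text from the question. The inline `sample` string, the `ports` list, and the stripping of the trailing semicolon are my additions, not part of the answer:

```python
# Sample text from the question, inlined so no file is needed.
sample = """module abc ( A, B, C, NSUP, PSUP, SEL );
input NSUP;
input PSUP;
input SEL;
inout A;
inout B;
output C;
//sample text input pins
//sample text output pins
"""

ports = []
for line in sample.splitlines():
    # str.startswith accepts a tuple of prefixes to try.
    if line.startswith(("input", "output", "inout")):
        # Take the second whitespace-separated field and drop the trailing ";".
        ports.append(line.split()[1].rstrip(";"))

print(ports)  # ['NSUP', 'PSUP', 'SEL', 'A', 'B', 'C']
```

Note that a plain prefix check is looser than the regex: it would also accept a line such as "inputs x;", so it is only a fit when the file is known to be well formed.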


 