
Matching start of line in regex (^ returns empty list)

I am using Python's re module to extract some information from a .txt file.

My .txt file looks like this:

621345
21345[45]6213
421345[45]21345
21345[45]6213456
66456
21345[45]621345

I want to match the lines that begin with 21345.

My code is as follows:

import re

pattern = re.compile('^21345.+')
filename = 'myfile.txt'


with open(filename, 'r') as f:
    found = re.findall(pattern, f.read())
    print(found)

This returns an empty list. It should return:

['21345[45]6213', '21345[45]6213456', '21345[45]621345']

I have tried matching just 21345, which works. When I add the ^, I start getting an empty list.

The problem is that, by default, the ^ anchor matches only at the very beginning of the string. f.read() returns the entire file as a single string, and that string starts with 621345 rather than 21345, so the pattern never matches and findall returns an empty list. If you want ^ to match at the beginning of each line, set the re.MULTILINE flag when compiling your pattern, e.g.:

pattern = re.compile('^21345.+', re.MULTILINE)

That returns the desired list.
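
To see the difference without a file on disk, here is a minimal sketch that uses an in-memory string with the same contents as the file in the question. The variable names (text, default_matches, and so on) are only for illustration. It also shows one possible alternative, assuming you would rather test each line separately than use the flag.

```python
import re

# Sample text mirroring the file contents from the question.
text = """621345
21345[45]6213
421345[45]21345
21345[45]6213456
66456
21345[45]621345
"""

# Default mode: ^ matches only at the very start of the whole string.
default_pattern = re.compile(r'^21345.+')
default_matches = default_pattern.findall(text)

# MULTILINE mode: ^ also matches immediately after each newline.
multiline_pattern = re.compile(r'^21345.+', re.MULTILINE)
multiline_matches = multiline_pattern.findall(text)

# Alternative: check each line on its own; match() anchors at the
# start of the string it is given, so no flag is needed here.
per_line_matches = [
    line for line in text.splitlines() if default_pattern.match(line)
]

print(default_matches)    # []
print(multiline_matches)  # ['21345[45]6213', '21345[45]6213456', '21345[45]621345']
print(per_line_matches)   # same three lines as above
```

The per-line version can be handy for large files, since you can iterate over the file object instead of reading everything into memory at once.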


 