Match between lines while skipping pattern with Regex

I've been trying to match between lines while skipping a pattern. I'm using the re.DOTALL regex flag.

What I need to extract is:

CHINTHAPUDI<br/>
CHINTHAPUDI<br/>

from between the Elector's Name lines and the Father's Name lines.

What I have come up with so far is this regex:

(?:^Elector\'s Name:.*?<br/>)(.*?)^(?:Husband|Father)

But the capture group also picks up the other Elector's Name lines beneath the first one.

Link to my regex101

Here's the document I want to match against:

Elector's Name: ANANTH CHINTAPUDI<br/>
Elector's Name: THIRUPATHI <br/>
Elector's Name: SRINIVASH <br/>
CHINTHAPUDI<br/>
CHINTHAPUDI<br/>
Father's Name: POSHANNA <br/>
Father's Name: SHANKAR <br/>
Father's Name: SHANKAR <br/>
CHINTAPUDDI<br/>
CHINTHAPUDI<br/>
CHINTHAPUDI<br/>

How could I go about matching from the last Elector's Name line up to the first Father's Name line?
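For reference, the behaviour described above can be reproduced with a short Python sketch. It assumes `re.MULTILINE` is set alongside `re.DOTALL` (the question only mentions `re.DOTALL`, but `^` matches at line starts only with `re.MULTILINE`); the names `text`, `attempt` and `captured` are made up for illustration.

```python
import re

# Sample document copied from the question; `text` is a hypothetical name.
text = (
    "Elector's Name: ANANTH CHINTAPUDI<br/>\n"
    "Elector's Name: THIRUPATHI <br/>\n"
    "Elector's Name: SRINIVASH <br/>\n"
    "CHINTHAPUDI<br/>\n"
    "CHINTHAPUDI<br/>\n"
    "Father's Name: POSHANNA <br/>\n"
    "Father's Name: SHANKAR <br/>\n"
    "Father's Name: SHANKAR <br/>\n"
    "CHINTAPUDDI<br/>\n"
    "CHINTHAPUDI<br/>\n"
    "CHINTHAPUDI<br/>\n"
)

# The regex from the question. Assumption: re.MULTILINE is enabled so that
# ^ can match at the start of each line, not only at the start of the string.
attempt = re.compile(
    r"(?:^Elector's Name:.*?<br/>)(.*?)^(?:Husband|Father)",
    re.DOTALL | re.MULTILINE,
)

match = attempt.search(text)
# Only the first "Elector's Name" line is consumed by the non-capturing group,
# so the lazy capture group swallows the remaining ones as well.
captured = match.group(1).strip().splitlines() if match else []
for line in captured:
    print(line)
```

The printed lines include the second and third Elector's Name lines, which is the unwanted part of the capture.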

Here's an option which works for your provided input:

(?:Elector\'s Name:.*?<br/>\r?\n)+(.*?)(?:Husband|Father)

There is one potential issue that you should consider if you use this: if an Elector's Name line appears earlier in the document, the match starts at that first set of Elector's Name lines. See demo.
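As a rough illustration of how this could be used from Python with `re.DOTALL`, here is a minimal sketch. The variable `text` and the helper `extract_between` are hypothetical names, and the sample is the document from the question. In a double-quoted raw string the apostrophe needs no backslash.

```python
import re

# Sample document copied from the question; `text` is a hypothetical name.
text = (
    "Elector's Name: ANANTH CHINTAPUDI<br/>\n"
    "Elector's Name: THIRUPATHI <br/>\n"
    "Elector's Name: SRINIVASH <br/>\n"
    "CHINTHAPUDI<br/>\n"
    "CHINTHAPUDI<br/>\n"
    "Father's Name: POSHANNA <br/>\n"
    "Father's Name: SHANKAR <br/>\n"
    "Father's Name: SHANKAR <br/>\n"
    "CHINTAPUDDI<br/>\n"
    "CHINTHAPUDI<br/>\n"
    "CHINTHAPUDI<br/>\n"
)

# The repeated group consumes every consecutive "Elector's Name" line,
# so the capture group starts only after the last of them.
pattern = re.compile(
    r"(?:Elector's Name:.*?<br/>\r?\n)+(.*?)(?:Husband|Father)",
    re.DOTALL,
)

def extract_between(doc):
    """Return the captured lines as a list, or [] if nothing matches."""
    match = pattern.search(doc)
    return match.group(1).splitlines() if match else []

print(extract_between(text))
# ['CHINTHAPUDI<br/>', 'CHINTHAPUDI<br/>']
```

`re.search` returns only the first match, which is where the caveat above applies: an earlier run of Elector's Name lines would be the one that gets picked up.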

Additionally, since your regex attempt required that Elector's Name and Husband or Father be at the beginning of a line, here's a version which maintains that requirement. If possible, I would avoid this, as it results in a much slower (30x) check.

(?:\r?\nElector\'s Name:.*?<br/>)+\r?\n(.*?)\r?\n(?=Husband|Father)
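A comparable Python sketch for the line-anchored version, again with hypothetical names (`text`, `extract_anchored`) and the sample from the question. Note that this pattern expects a line break before each Elector's Name line it consumes, so an Elector's Name line at the very start of the string is not part of the match. That does not change the captured text here.

```python
import re

# Sample document copied from the question; `text` is a hypothetical name.
text = (
    "Elector's Name: ANANTH CHINTAPUDI<br/>\n"
    "Elector's Name: THIRUPATHI <br/>\n"
    "Elector's Name: SRINIVASH <br/>\n"
    "CHINTHAPUDI<br/>\n"
    "CHINTHAPUDI<br/>\n"
    "Father's Name: POSHANNA <br/>\n"
    "Father's Name: SHANKAR <br/>\n"
    "Father's Name: SHANKAR <br/>\n"
    "CHINTAPUDDI<br/>\n"
    "CHINTHAPUDI<br/>\n"
    "CHINTHAPUDI<br/>\n"
)

# Each "Elector's Name" line must follow a line break, and the lookahead
# requires Husband/Father to sit directly after a line break as well.
anchored = re.compile(
    r"(?:\r?\nElector's Name:.*?<br/>)+\r?\n(.*?)\r?\n(?=Husband|Father)",
    re.DOTALL,
)

def extract_anchored(doc):
    """Return the captured lines as a list, or [] if nothing matches."""
    match = anchored.search(doc)
    return match.group(1).splitlines() if match else []

print(extract_anchored(text))
# ['CHINTHAPUDI<br/>', 'CHINTHAPUDI<br/>']

# An "Elector's Name" that appears mid-line is ignored by this version.
inline = "x Elector's Name: A<br/>\nFOO<br/>\nFather's Name: B<br/>\n"
print(extract_anchored(inline))
# []
```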
