Match between lines while skipping pattern with Regex
I've been trying to match between lines while skipping a pattern. I'm using the re.DOTALL regex flag.
What I need to extract is
CHINTHAPUDI<br/>
CHINTHAPUDI<br/>
from between the Elector's Name lines and the Father's Name lines.
The regex I have come up with so far is:
(?:^Elector\'s Name:.*?<br/>)(.*?)^(?:Husband|Father)
But it also matches the other Elector's Name lines beneath the first match.
Link to my regex101
Here's the document from which I want to match:
Elector's Name: ANANTH CHINTAPUDI<br/>
Elector's Name: THIRUPATHI <br/>
Elector's Name: SRINIVASH <br/>
CHINTHAPUDI<br/>
CHINTHAPUDI<br/>
Father's Name: POSHANNA <br/>
Father's Name: SHANKAR <br/>
Father's Name: SHANKAR <br/>
CHINTAPUDDI<br/>
CHINTHAPUDI<br/>
CHINTHAPUDI<br/>
How could I go about matching from the last Elector's Name line up to Father's Name?
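For reference, the behaviour described in the question can be reproduced with a short Python sketch. The sample text is rebuilt from the document above; the use of re.MULTILINE alongside re.DOTALL is an assumption (the question only mentions re.DOTALL, but ^ needs re.MULTILINE to match at line starts).

```python
import re

# Sample document from the question, rebuilt line by line.
LINES = [
    "Elector's Name: ANANTH CHINTAPUDI<br/>",
    "Elector's Name: THIRUPATHI <br/>",
    "Elector's Name: SRINIVASH <br/>",
    "CHINTHAPUDI<br/>",
    "CHINTHAPUDI<br/>",
    "Father's Name: POSHANNA <br/>",
    "Father's Name: SHANKAR <br/>",
    "Father's Name: SHANKAR <br/>",
    "CHINTAPUDDI<br/>",
    "CHINTHAPUDI<br/>",
    "CHINTHAPUDI<br/>",
]
text = "\n".join(LINES) + "\n"

# The attempt from the question. re.MULTILINE is an assumption here:
# without it, ^ would only match at the very start of the string.
attempt = re.compile(
    r"(?:^Elector\'s Name:.*?<br/>)(.*?)^(?:Husband|Father)",
    re.DOTALL | re.MULTILINE,
)

captured = attempt.search(text).group(1)

# The capture starts right after the FIRST Elector's Name line, so it
# still contains the second and third Elector's Name lines.
print(captured)
```

The lazy (.*?) begins right after the first Elector's Name line and runs until the first line starting with Father, which is why the extra Elector's Name lines end up inside the capture.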
Here's an option which works for your provided input:
(?:Elector\'s Name:.*?<br/>\r?\n)+(.*?)(?:Husband|Father)
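As a minimal sketch of how this pattern might be applied in Python with re.DOTALL, assuming the document is held in a string with \n line endings (the variable names are illustrative, not from the original post):

```python
import re

# Sample document from the question, rebuilt line by line.
LINES = [
    "Elector's Name: ANANTH CHINTAPUDI<br/>",
    "Elector's Name: THIRUPATHI <br/>",
    "Elector's Name: SRINIVASH <br/>",
    "CHINTHAPUDI<br/>",
    "CHINTHAPUDI<br/>",
    "Father's Name: POSHANNA <br/>",
    "Father's Name: SHANKAR <br/>",
    "Father's Name: SHANKAR <br/>",
    "CHINTAPUDDI<br/>",
    "CHINTHAPUDI<br/>",
    "CHINTHAPUDI<br/>",
]
text = "\n".join(LINES) + "\n"

# The repeated non-capturing group consumes every consecutive
# Elector's Name line, so the capture group starts after the last one.
pattern = re.compile(
    r"(?:Elector\'s Name:.*?<br/>\r?\n)+(.*?)(?:Husband|Father)",
    re.DOTALL,
)

match = pattern.search(text)
wanted = match.group(1) if match else None
print(repr(wanted))
```

Note that the capture keeps the trailing newline before Father; calling wanted.strip() would drop it if that is not wanted.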
There is one potential issue that you should consider if you use this: if a block of Elector's Name lines appears earlier in the document, that first set will be the one matched. See demo.
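The caveat can be illustrated with a small sketch. The two-block input below is made up for illustration and is not from the original post: re.search returns only the first block, while re.findall walks through every block in order.

```python
import re

# Hypothetical input with two separate Elector/Father blocks.
LINES = [
    "Elector's Name: A <br/>",
    "FIRST<br/>",
    "Father's Name: B <br/>",
    "Elector's Name: C <br/>",
    "SECOND<br/>",
    "Father's Name: D <br/>",
]
text = "\n".join(LINES) + "\n"

pattern = re.compile(
    r"(?:Elector\'s Name:.*?<br/>\r?\n)+(.*?)(?:Husband|Father)",
    re.DOTALL,
)

# search() stops at the earliest block in the document.
first_only = pattern.search(text).group(1)

# findall() returns the capture group of every non-overlapping match.
all_blocks = pattern.findall(text)

print(repr(first_only))
print(all_blocks)
```

If the block of interest is not the first one, iterating over all matches and picking by position is one possible way around the issue.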
Additionally, as your regex attempt required that Elector's Name and Husband or Father be at the beginning of the line, here's a version which maintains that requirement. If possible, I would avoid this, as it results in a much slower (30x) check.
(?:\r?\nElector\'s Name:.*?<br/>)+\r?\n(.*?)\r?\n(?=Husband|Father)
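A minimal Python sketch of this line-anchored variant, again assuming \n line endings and re.DOTALL (variable names are illustrative). Because the pattern requires a newline before each Elector's Name, a first line sitting at the very start of the string is skipped and the match begins at the second Elector's Name line; the captured text is the same.

```python
import re

# Sample document from the question, rebuilt line by line.
LINES = [
    "Elector's Name: ANANTH CHINTAPUDI<br/>",
    "Elector's Name: THIRUPATHI <br/>",
    "Elector's Name: SRINIVASH <br/>",
    "CHINTHAPUDI<br/>",
    "CHINTHAPUDI<br/>",
    "Father's Name: POSHANNA <br/>",
    "Father's Name: SHANKAR <br/>",
    "Father's Name: SHANKAR <br/>",
    "CHINTAPUDDI<br/>",
    "CHINTHAPUDI<br/>",
    "CHINTHAPUDI<br/>",
]
text = "\n".join(LINES) + "\n"

# Each Elector's Name must follow a line break, and the lookahead
# requires Husband or Father to start the line after the capture.
anchored = re.compile(
    r"(?:\r?\nElector\'s Name:.*?<br/>)+\r?\n(.*?)\r?\n(?=Husband|Father)",
    re.DOTALL,
)

match = anchored.search(text)
wanted = match.group(1) if match else None
print(repr(wanted))
```

Here the surrounding line breaks are matched outside the capture group, so the result has no trailing newline.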