Multiline regex pattern for text patterns
I have hundreds of pages of transcript of the following type:
<p><strong>ROGELIO JIMÉNEZ PONS:</strong> Quisiera
<p>Text here...</p>
<p><strong>PRESIDENTE ANDRÉS MANUEL LÓPEZ OBRADOR:</strong>
<p>Text here...</p>
<p>Text here...</p>
<p><strong>PREGUNTA:</strong>
<p>Text here...</p>
<p><strong>PRESIDENTE ANDRÉS MANUEL LÓPEZ OBRADOR:</strong>
<p>Text here...</p>
<p>Text here...</p>
<p>Text here...</p>
<p><strong>INTERLOCUTOR:</strong>
I want to capture and return just what Obrador says:
<p><strong>PRESIDENTE ANDRÉS MANUEL LÓPEZ OBRADOR:</strong>
<p>Text here...</p>
<p>Text here...</p>
<p><strong>PRESIDENTE ANDRÉS MANUEL LÓPEZ OBRADOR:</strong>
<p>Text here...</p>
<p>Text here...</p>
<p>Text here...</p>
I get close with this regex:
<p><strong>PRESIDENTE(.*)\n(.*)?\n?(.*)?\n?(.*)
But it isn't quite right, since I can't work out the end of the pattern, which should stop at
<p><strong>[ANYTHING NOT PRESIDENTE]
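With the multiline flag enabled (so that ^ matches at the start of every line), the following pattern captures the lines after each Obrador heading, up to the next line that begins with <p><strong>:
OBRADOR:<\/strong>\r?\n((?:(?!<p><strong>)^[^\r\n]+\r?\n)+)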
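A minimal sketch of how the pattern above might be applied in Python, assuming the transcript is already loaded into a string and every line ends with a newline. The names PATTERN, obrador_blocks, and sample are made up for illustration, and the sample text is a shortened stand-in for the real transcript. The re.MULTILINE flag is needed so that ^ matches at each line start, and the negative lookahead stops the capture at the next <p><strong> speaker line.

```python
import re

# The pattern from the post, unchanged. re.MULTILINE lets ^ match after each newline.
PATTERN = re.compile(
    r"OBRADOR:<\/strong>\r?\n((?:(?!<p><strong>)^[^\r\n]+\r?\n)+)",
    re.MULTILINE,
)


def obrador_blocks(text):
    """Return the text that follows each Obrador heading (hypothetical helper)."""
    return [m.group(1) for m in PATTERN.finditer(text)]


# Shortened, made-up sample in the same shape as the transcript.
sample = (
    "<p><strong>ROGELIO JIMÉNEZ PONS:</strong> Quisiera\n"
    "<p>Text A</p>\n"
    "<p><strong>PRESIDENTE ANDRÉS MANUEL LÓPEZ OBRADOR:</strong>\n"
    "<p>Text B</p>\n"
    "<p>Text C</p>\n"
    "<p><strong>PREGUNTA:</strong>\n"
    "<p>Text D</p>\n"
    "<p><strong>PRESIDENTE ANDRÉS MANUEL LÓPEZ OBRADOR:</strong>\n"
    "<p>Text E</p>\n"
    "<p><strong>INTERLOCUTOR:</strong>\n"
)

blocks = obrador_blocks(sample)
for block in blocks:
    print(block, end="---\n")
```

Note that the pattern requires a line break after every captured line, so a final Obrador paragraph at the very end of a file with no trailing newline would be missed. It also skips any text that sits on the same line as the heading itself.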