Regex to return all text between tags if text contains specific characters (Notepad++)
I am working on a large XML file made up of units with the following structure:
<TrU>
<CrD>16122013, 11:54:13
<CrU>IK
<ChD>16122013, 11:54:13
<ChU>IK
<Seg L=EN-GB>some text in English
<Seg L=RU-RU>some text in Russian
</TrU>
I need a regular expression that finds such complete structures only if any of the following characters occurs between the tags <TrU> and </TrU>:
íèé
The expression that finds such structures without the specific-character criterion is: <TrU>.*?</TrU>
I modified it into: <TrU>.*?[íèé].*?</TrU>
but it behaves greedily: it matches several neighbouring units at once, of which usually only one contains one of the desired characters.
Try this:
<TrU>(?:(?!<TrU).)*?[íèé].*?<\/TrU>
Tested in Notepad++.
Make sure ". matches newline" is checked.
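For anyone who wants to check the behaviour outside Notepad++, here is a minimal Python sketch. It assumes Python's `re` module, with `re.DOTALL` standing in for the ". matches newline" option, and the two sample units are made up for illustration. The `(?!<TrU)` lookahead is what stops the lazy scan from stepping over the start of the next unit, so a unit without any of the characters can no longer be swallowed into the match.

```python
import re

# Two hypothetical units in the layout from the question; only the second
# one contains one of the target characters (é).
SAMPLE = (
    "<TrU>\n"
    "<CrD>16122013, 11:54:13\n"
    "<CrU>IK\n"
    "<Seg L=EN-GB>plain text\n"
    "<Seg L=RU-RU>more plain text\n"
    "</TrU>\n"
    "<TrU>\n"
    "<CrD>16122013, 11:54:13\n"
    "<CrU>IK\n"
    "<Seg L=EN-GB>a café in town\n"
    "<Seg L=RU-RU>more plain text\n"
    "</TrU>\n"
)

# re.DOTALL lets "." match newlines, like ". matches newline" in Notepad++.
NAIVE = re.compile(r"<TrU>.*?[íèé].*?</TrU>", re.DOTALL)
TEMPERED = re.compile(r"<TrU>(?:(?!<TrU).)*?[íèé].*?</TrU>", re.DOTALL)

naive_hits = NAIVE.findall(SAMPLE)
tempered_hits = TEMPERED.findall(SAMPLE)

# The naive pattern starts at the first <TrU> and runs into the second unit.
print(naive_hits[0].count("<TrU>"))     # 2
# The tempered pattern returns only the unit that contains the character.
print(tempered_hits[0].count("<TrU>"))  # 1
```

Notepad++ uses a different regex engine than Python, so this only illustrates the logic of the pattern; the constructs used here (lazy quantifiers, negative lookahead, character classes) are available in both.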