Regex to return all text between tags if the text contains specific characters (Notepad++)

I am working on a large XML file made up of units with the following structure:

<TrU>
<CrD>16122013, 11:54:13
<CrU>IK
<ChD>16122013, 11:54:13
<ChU>IK
<Seg L=EN-GB>some text in English
<Seg L=RU-RU>some text in Russian
</TrU>

I need a regular expression that finds such a complete structure only if any of the following characters occurs between the tags <TrU> and </TrU>:

íèé

The expression that finds such structures without the character criterion is: <TrU>.*?</TrU>

I modified it into: <TrU>.*?[íèé].*?</TrU>

but it matches too much: it finds several neighbouring units at a time, usually only one of which contains one of the desired characters.
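The overmatching can be reproduced outside Notepad++. The following is a minimal sketch in Python, assuming that Python's `re` module treats these constructs the same way Notepad++ does and that `re.DOTALL` stands in for the ". matches newline" option; the sample text is made up for illustration.

```python
import re

# Hypothetical sample: two neighbouring units; only the second contains "é".
sample = (
    "<TrU>\n<Seg L=EN-GB>plain text\n<Seg L=RU-RU>more plain text\n</TrU>\n"
    "<TrU>\n<Seg L=EN-GB>café\n<Seg L=RU-RU>other text\n</TrU>\n"
)

# re.DOTALL is assumed to play the role of Notepad++'s ". matches newline".
naive = re.compile(r"<TrU>.*?[íèé].*?</TrU>", re.DOTALL)

match = naive.search(sample)

# The match starts at the first <TrU>, because the lazy .*? is still free
# to walk across the first </TrU> until it finally reaches an "é".
units_spanned = match.group(0).count("</TrU>")
print(units_spanned)
```

The lazy quantifier only prefers the shortest continuation from a given starting point; it does not stop the match from beginning at an earlier <TrU> and crossing unit boundaries.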

Try this:

<TrU>(?:(?!<TrU).)*?[íèé].*?<\/TrU>

Tested in Notepad++.

Make sure ". matches newline" is checked.

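Explanation

<TrU> matches the opening tag of a unit.
(?:(?!<TrU).)*? consumes as few characters as possible, one at a time, and the negative lookahead (?!<TrU) refuses any position where another opening tag starts, so the match cannot run into the next unit.
[íèé] requires one of the wanted characters.
.*?<\/TrU> continues up to the nearest closing tag.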
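The behaviour of the suggested pattern can also be checked with a short script. This is a sketch in Python rather than Notepad++, assuming `re.DOTALL` corresponds to the ". matches newline" option; the three sample units are invented for illustration.

```python
import re

# Hypothetical sample: three units; only the middle one contains "é".
sample = (
    "<TrU>\n<Seg L=EN-GB>plain text\n<Seg L=RU-RU>more plain text\n</TrU>\n"
    "<TrU>\n<Seg L=EN-GB>café\n<Seg L=RU-RU>other text\n</TrU>\n"
    "<TrU>\n<Seg L=EN-GB>last unit\n<Seg L=RU-RU>still plain\n</TrU>\n"
)

# (?:(?!<TrU).)*? advances one character at a time and refuses to step
# onto the start of another opening tag, so a match stays inside one unit.
pattern = re.compile(r"<TrU>(?:(?!<TrU).)*?[íèé].*?</TrU>", re.DOTALL)

# The pattern has no capturing groups, so findall returns whole matches.
matches = pattern.findall(sample)

for unit in matches:
    print(unit)
    print("---")
```

With this sample only the middle unit is returned. If text containing one of the characters could appear between a closing tag and the next opening tag, a stricter lookahead such as (?!</?TrU>) would be one way to guard against that; this variant is an assumption and not part of the original answer.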
