regex to return all text between tags if text contains specific characters (Notepad++)

I am working on a large XML file made up of units with the following structure:

<TrU>
<CrD>16122013, 11:54:13
<CrU>IK
<ChD>16122013, 11:54:13
<ChU>IK
<Seg L=EN-GB>some text in English
<Seg L=RU-RU>some text in Russian
</TrU>

I need a regular expression that finds such complete structures only if any of the following characters occurs between the tags <TrU> and </TrU>:

íèé

The expression that finds such structures without the character criterion is: <TrU>.*?</TrU>

I modified it to: <TrU>.*?[íèé].*?</TrU>

but it still matches too much: it finds several neighbouring units at a time, usually only one of which contains one of the desired characters.

Try this:

<TrU>(?:(?!<TrU).)*?[íèé].*?<\/TrU>

Tested in Notepad++.

Make sure the ". matches newline" option is checked.

Explanation
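
The negative lookahead (?!<TrU) is checked before every character consumed ahead of the accented letter, so that part of the match can never run into the opening tag of a following unit. The original pattern has no such guard: when the first unit lacks the character, the lazy .*? simply keeps expanding across unit boundaries until it finds one. Below is a minimal sketch of that difference in Python rather than Notepad++. It assumes Python's re module treats these constructs the same way as the Notepad++ engine, with re.DOTALL standing in for ". matches newline". The sample data and variable names are made up for illustration.

```python
import re

# Hypothetical sample: three units, only the second contains one of í, è, é.
SAMPLE = (
    "<TrU>\n<Seg L=EN-GB>plain text\n<Seg L=RU-RU>text\n</TrU>\n"
    "<TrU>\n<Seg L=EN-GB>café menu\n<Seg L=RU-RU>text\n</TrU>\n"
    "<TrU>\n<Seg L=EN-GB>more plain text\n<Seg L=RU-RU>text\n</TrU>\n"
)

# Pattern from the question: the lazy .*? may still cross into the next unit.
NAIVE = re.compile(r"<TrU>.*?[íèé].*?</TrU>", re.DOTALL)

# Pattern from the answer: (?!<TrU) stops the scan at the next opening tag.
TEMPERED = re.compile(r"<TrU>(?:(?!<TrU).)*?[íèé].*?<\/TrU>", re.DOTALL)

naive_matches = NAIVE.findall(SAMPLE)
tempered_matches = TEMPERED.findall(SAMPLE)

for label, matches in (("naive", naive_matches), ("tempered", tempered_matches)):
    for m in matches:
        # Count how many units each match swallowed.
        print(label, "match spans", m.count("<TrU>"), "unit(s)")
```

On this sample the pattern from the question returns one match that swallows the first two units, while the pattern from the answer returns only the unit that contains "café".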
