Regular expression to find the closest pattern match to an html tag

I am looking for a solution to the following problem. I have a text sequence like the one below, and I would like to extract the square-bracketed group that sits closest before each <em> tag.

[P1/1]0(4)0(5)**[P1/432]** g(5)I(2)d(7)a(8)`<em>`b(5)[P1/4]C(6)e(7)B(8)B`</em>`(9)[P1/5]0(6)i(7)[P1/6]0(1)I(2)[P1/7]0(6)[P1/1]0(1)0(2)[P1/2]E(1)c(2)d(3)a(4)**[P1/3]** 0(1)`<em>`b(2)[P1/4]C(1)e(2)B(3)B`</em>`(4)[P1/5]0(1)

So in the text above, what I am searching for is [P1/432] and [P1/3].

With the regular expression ((.(?!\\[.*?]))+?)<em> , I cannot get only the bracketed part; the match covers everything from [ to <em> .

Can someone help me?

There is a straightforward solution if we don't care about nested or unbalanced brackets: match a bracket group, then use a lookahead to require that <em> follows before any other bracket appears.

\[[^\]\[]*\](?=[^\]\[]*<em>)
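To check the pattern against the sample text from the question, here is a minimal Python sketch. The question does not say which regex engine is in use, so Python's `re` module is an assumption, and the names `text`, `pattern` and `inner_pattern` are only illustrative. The second pattern is a small variation (not part of the answer above) that adds a capture group so only the contents inside the brackets are returned.

```python
import re

# Sample text from the question, with the markdown emphasis markers removed.
text = (
    "[P1/1]0(4)0(5)[P1/432] g(5)I(2)d(7)a(8)<em>b(5)[P1/4]C(6)e(7)B(8)B</em>(9)"
    "[P1/5]0(6)i(7)[P1/6]0(1)I(2)[P1/7]0(6)[P1/1]0(1)0(2)[P1/2]E(1)c(2)d(3)a(4)"
    "[P1/3] 0(1)<em>b(2)[P1/4]C(1)e(2)B(3)B</em>(4)[P1/5]0(1)"
)

# The pattern from the answer: a bracket group with no nested brackets,
# followed (lookahead) by non-bracket characters and then "<em>".
pattern = re.compile(r"\[[^\]\[]*\](?=[^\]\[]*<em>)")

# Variation (an assumption, not in the answer): capture only the inner text.
inner_pattern = re.compile(r"\[([^\]\[]*)\](?=[^\]\[]*<em>)")

print(pattern.findall(text))        # ['[P1/432]', '[P1/3]']
print(inner_pattern.findall(text))  # ['P1/432', 'P1/3']
```

The lookahead does not consume characters, so the match itself stops at the closing bracket. A closing `</em>` never satisfies it, because the lookahead needs the literal `<em>`.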

