Multi-line Pattern and tag search

I'm trying to make a pattern for tags, but the sub method only replaces the first char and 3 at the end of the line. I'm trying to replace all tags on the line, and it should also work across multiple lines.

import re
p = re.compile('<img=([^}]*)>([^}]*)</img>', re.S)
p.sub(r'[img=\1]\2[/img]', '<img="test">dsad</img> <img="test2">dsad2</img>')
Output:
'[img="test">dsad</img> <img="test2"]dsad2[/img]'

Towards the start of your re's pattern, you're using:

<img=([^}]*)>

This will gobble up (as group 1) all characters after the leading <img=, including other tags, up to the last > it can possibly gobble; * is GREEDY, so it gobbles up as much as it possibly can. Not sure why you're specifically excluding closing braces (})? Maybe you meant to exclude closing angle brackets (>) instead.
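To see the greedy behaviour concretely, here is a small sketch that runs the question's original pattern against the question's sample string and prints what each group captured. The variable names (text, greedy, m) are only for illustration.

```python
import re

# Sample input taken from the question.
text = '<img="test">dsad</img> <img="test2">dsad2</img>'

# The original pattern: [^}]* is greedy and nothing in the input is a '}',
# so group 1 runs across the first closing tag and the second opening tag.
greedy = re.compile(r'<img=([^}]*)>([^}]*)</img>', re.S)
m = greedy.search(text)

print(m.group(1))  # "test">dsad</img> <img="test2"
print(m.group(2))  # dsad2
```

Only one match is found for the whole line, which is why sub appears to touch just the very start and the very end of the string.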

For NON-greedy matching, you need *? instead of *; with that, you'll be gobbling up as little as you can, instead of as much as you can. So, I think you mean:

p = re.compile(r'<img=([^>]*?)>(.*?)</img>', re.S)

This matches one img tag (and all tags inside it), and appears to be performing exactly the substitutions you mean.
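As a quick check, the following sketch applies the non-greedy pattern to the sample string from the question, plus a made-up two-line input (the multiline string is an assumption added here, not from the question) to show that re.S lets the tag body span a newline.

```python
import re

# Non-greedy pattern from the answer: group 1 stops at the first '>',
# group 2 stops at the first '</img>'.
p = re.compile(r'<img=([^>]*?)>(.*?)</img>', re.S)

text = '<img="test">dsad</img> <img="test2">dsad2</img>'
result = p.sub(r'[img=\1]\2[/img]', text)
print(result)  # [img="test"]dsad[/img] [img="test2"]dsad2[/img]

# Hypothetical input whose tag body contains a newline; re.S makes '.' match it.
multiline = '<img="a">line one\nline two</img>'
result_multiline = p.sub(r'[img=\1]\2[/img]', multiline)
print(result_multiline)
```

Each tag is now matched and replaced separately, so every img tag on the line is converted.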
