A question about Python regular expressions.
I would like to match a div block like this:
<div class="leftTail"><ul class="hotnews">any news stuff</ul></div>
I was thinking of a pattern like:
p = re.compile(r'<div\s+class=\"leftTail\">[^(div)]+</div>')
but it does not seem to work properly.
I also tried another pattern:
p = re.compile(r'<div\s+class=\"leftTail\">[\W|\w]+</div>')
With this one I get much more than I expected: it matches everything up to the last closing div tag in the file.
Thanks for any help.
You might want to consider graduating to an actual HTML parser. I suggest you give Beautiful Soup a try. HTML can be formatted in many crazy ways, and regular expressions may not work on all of them, even if you write them correctly.
Don't use regular expressions to parse XML or HTML. You'll never be able to get it to work correctly for nested divs.
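To illustrate the parser approach, here is a minimal sketch that uses the standard library's html.parser instead of Beautiful Soup (a swapped-in choice, so that nothing needs installing). The class name LeftTailExtractor and the helper extract_left_tail are hypothetical names made up for this example. It counts div depth so that a nested div does not end the block early:

```python
from html.parser import HTMLParser


class LeftTailExtractor(HTMLParser):
    """Collect the text inside every <div class="leftTail">, nested divs included."""

    def __init__(self):
        super().__init__()
        self.depth = 0     # number of open <div>s inside the current target block
        self.blocks = []   # text of each finished target block
        self._parts = []   # text pieces of the block in progress

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        if self.depth:
            # A nested <div> inside the target block: just go one level deeper.
            self.depth += 1
        elif "leftTail" in (dict(attrs).get("class") or "").split():
            # The opening tag of a target block.
            self.depth = 1
            self._parts = []

    def handle_endtag(self, tag):
        if tag == "div" and self.depth:
            self.depth -= 1
            if self.depth == 0:
                # The matching </div> of the target block was reached.
                self.blocks.append("".join(self._parts))

    def handle_data(self, data):
        if self.depth:
            self._parts.append(data)


def extract_left_tail(html):
    """Hypothetical helper: return the text of each leftTail block in html."""
    parser = LeftTailExtractor()
    parser.feed(html)
    parser.close()
    return parser.blocks


sample = (
    '<div class="leftTail"><ul class="hotnews">any <div>nested</div> news stuff</ul></div>'
    '<div class="other">unrelated</div>'
)
print(extract_left_tail(sample))
```

This sketch only gathers text and assumes reasonably well-formed markup; for messy real-world pages, Beautiful Soup, as suggested above, is likely the more forgiving option.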
Try this:
p = re.compile(r'<div\s+class=\"leftTail\">.*?</div>')
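As a rough comparison of the three patterns in this thread, the sketch below runs them on a made-up sample string (the "video news" text and the second div are assumptions added for illustration). It shows that [^(div)] is a character class excluding the single characters (, d, i, v and ), not the word "div"; that the greedy + runs to the last </div>; and that the non-greedy .*? stops at the first one. The re.DOTALL flag is an addition here so that . also matches newlines in multi-line blocks.

```python
import re

# Hypothetical sample: the target block followed by an unrelated div.
sample = (
    '<div class="leftTail"><ul class="hotnews">video news</ul></div>'
    '<div class="other">unrelated</div>'
)

# First pattern from the question: [^(div)] excludes the characters ( d i v ),
# so it cannot get past the "v" in "video" and finds no match at all.
char_class = re.compile(r'<div\s+class="leftTail">[^(div)]+</div>')

# Second pattern from the question: greedy, so it extends to the LAST </div>.
greedy = re.compile(r'<div\s+class="leftTail">[\W|\w]+</div>')

# Pattern from the answer: non-greedy, so it stops at the FIRST </div>.
lazy = re.compile(r'<div\s+class="leftTail">.*?</div>', re.DOTALL)

char_class_match = char_class.search(sample)
greedy_text = greedy.search(sample).group(0)
lazy_text = lazy.search(sample).group(0)

print(char_class_match)
print(greedy_text)
print(lazy_text)
```

Because the non-greedy version stops at the first </div>, it will still cut the block short if a div is nested inside the leftTail block.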