Python's re.sub Changes Nothing When Regex Matches

Regex confuses me massively, and I'm trying to get Python (3.4.0) to replace multiple lines of MD/CSS code (between two lines marking the start and end of the "table" segment) in two places for a reddit bot. Neither replacement works, and I've tried several different regexes. I've also tried making the pattern a raw string and escaping more characters (though I haven't tried many combinations), as suggested in some other SO threads. Annoyingly, the regex matches fine on regex101.com (in both the PHP and Python flavours) and on Pythex.org. It just doesn't work in Python.

This is the relevant code; both snippets do more or less the same thing.

sidebar = r.get_settings(sub)["description"]
regex = r'(?<=\[\]\(#STARTTABLE\)\\n).*?(?=\\n\[\]\(#ENDTABLE\)|$)'
sidebar = re.sub(regex,md,sidebar)
r.update_settings(r.get_subreddit(sub),description=sidebar)


stylesheet = r.get_stylesheet(sub)["stylesheet"]
regex = r'(?<=\/\*START TABLE\*\/).*?(?=\/\*END TABLE\*\/|$)'
stylesheet = re.sub(regex,css, stylesheet)
r.set_stylesheet(sub,stylesheet)

I've uploaded the various variables to pastebin. The sidebar string is available here, md here, stylesheet here and css here.

Many thanks for your help.

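I fixed your regex by compiling it with the re.DOTALL flag, which makes . match newlines as well. I also removed the extra escaping from \\n: inside a raw string, \\n matches a literal backslash followed by n, whereas \n matches an actual newline. Here's the modified regular expression (re.S is the short alias for re.DOTALL):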

regex = re.compile(r'(?<=\[\]\(#STARTTABLE\)\n).*?(?=\n\[\]\(#ENDTABLE\)|$)', re.S)
sidebar = regex.sub(md, sidebar)
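
To see the effect of the flag in isolation, here is a minimal sketch. The sidebar and md strings are made-up stand-ins for the pastebin contents (the real values come from the reddit API calls in the question), so treat them as assumptions:

```python
import re

# Hypothetical stand-ins for the real sidebar text and replacement markdown.
sidebar = "Intro\n[](#STARTTABLE)\nold row 1\nold row 2\n[](#ENDTABLE)\nOutro"
md = "new row 1\nnew row 2"

pattern = r'(?<=\[\]\(#STARTTABLE\)\n).*?(?=\n\[\]\(#ENDTABLE\)|$)'

# Without re.DOTALL, "." stops at the first newline, so a multi-line
# table body can never be spanned and nothing is replaced.
without_flag = re.sub(pattern, md, sidebar)

# With re.DOTALL, "." also matches newlines and the whole body is replaced.
with_flag = re.sub(pattern, md, sidebar, flags=re.DOTALL)

print(without_flag == sidebar)  # the text is unchanged without the flag
print(with_flag)
```

Passing flags=re.DOTALL to re.sub is equivalent to compiling the pattern with re.S first; which one to use is mostly a matter of taste.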

But if the pattern occurs in the content only once, I wouldn't bother with such a complicated regex; I'd use the str.split() method instead.
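
A rough sketch of the str.split() idea, assuming each marker appears exactly once. The helper name replace_table and the sample strings are my own, not from the question:

```python
# Markers as they would appear in the sidebar; the surrounding newlines are
# included so the replacement sits on its own lines (an assumption).
START = "[](#STARTTABLE)\n"
END = "\n[](#ENDTABLE)"

def replace_table(text, replacement, start=START, end=END):
    """Swap the text between the first start marker and the next end marker."""
    if start not in text:
        return text  # no start marker: leave the text untouched
    head, rest = text.split(start, 1)
    if end not in rest:
        return text  # no end marker after the start: leave the text untouched
    _old_body, tail = rest.split(end, 1)
    return head + start + replacement + end + tail

# Hypothetical sample data standing in for the real sidebar.
sample = "Intro\n[](#STARTTABLE)\nold row 1\nold row 2\n[](#ENDTABLE)\nOutro"
print(replace_table(sample, "new row 1\nnew row 2"))
```

Unlike re.sub, this treats the replacement as plain text, so backslashes in the new markdown or CSS are not interpreted as escape sequences.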
