
Python Regex: escaping a back-reference

Here is my situation:

re.sub(r'([^\\])', r'\1[\W\1]*', string)

Put simply, I want to append [\W(itself)]* after (itself), where (itself) is a group of characters that may include special ones. That is why I need to put it in a set: to strip away all special meanings. However, my group can itself be a set, and I know that nested sets do not work. How do I escape or remove the square brackets so that I can safely put my group in the set?

My other attempt was to use \1(\W|\1)* instead, but then I need to escape the characters in my group without escaping any square brackets it may contain. How do I do so?

This is a dilemma. I do not know how to solve this problem or which way to go. Please help.

Thank you very much.

EDIT: I skipped a step. After matching any character except a backslash (the [^\\] part) and replacing it with the expression explained above, I will sometimes need to replace it with a set of similar characters. So 'a' becomes '[a@]', 's' becomes '[s5$]', and so on. The question as asked was really wrong. But I solved the problem, so if you are still trying to make sense of what I wrote earlier, please don't :)

You can use a function as the replacement in re.sub. This lets you call re.escape on the match before performing the substitution:

import re

def escape_repl(match):
    # Escape the matched character so it is literal both on its own and inside the set.
    return r'{0}[\W{0}]*'.format(re.escape(match.group(1)))

re.sub(r'([^\\])', escape_repl, string)

Example:

>>> print(re.sub(r'([^\\])', escape_repl, '[^$]'))
\[[\W\[]*\^[\W\^]*\$[\W\$]*\][\W\]]*
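
To check that the generated pattern behaves as intended, you can run it against a few strings. This is only a minimal sketch: build_pattern is a hypothetical helper name that does not appear in the question, and the sample inputs are made up for illustration.

```python
import re

def escape_repl(match):
    # Escape the matched character so it is literal both on its own and inside the set.
    return r'{0}[\W{0}]*'.format(re.escape(match.group(1)))

def build_pattern(word):
    # Hypothetical helper: expand every non-backslash character of `word`
    # into "<char>[\W<char>]*". The string returned by escape_repl is used
    # as-is, so its backslashes are not reinterpreted by re.sub.
    return re.sub(r'([^\\])', escape_repl, word)

pattern = build_pattern('a.b')
print(pattern)

# The '.' was escaped, so it only matches a literal dot.
print(bool(re.fullmatch(pattern, 'a--..  b!!')))
print(bool(re.fullmatch(pattern, 'axb')))
```

The first fullmatch call should succeed, because the runs of non-word characters are absorbed by the sets. The second should fail, because the escaped dot no longer matches an arbitrary character.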

I think this is what you are trying to do, but it is a little unclear from your question. Please provide some sample strings and expected results if this isn't what you're looking for.
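
If the step from the question's edit is what you need (replacing a character with a set of similar characters), the same function-as-replacement approach can probably be extended. The sketch below is one possible reading of that edit, not the asker's actual solution: LOOKALIKES, lookalike_repl and build_pattern are hypothetical names, and only the 'a' and 's' entries come from the question.

```python
import re

# Hypothetical lookalike table; only these two entries appear in the question.
LOOKALIKES = {'a': 'a@', 's': 's5$'}

def lookalike_repl(match):
    ch = match.group(1)
    # Escape every member individually so characters such as '$', ']' or '^'
    # stay literal when placed inside the square brackets.
    members = ''.join(re.escape(c) for c in LOOKALIKES.get(ch, ch))
    return r'[{0}][\W{0}]*'.format(members)

def build_pattern(word):
    # Expand every non-backslash character into "[set][\Wset]*".
    return re.sub(r'([^\\])', lookalike_repl, word)

pattern = build_pattern('as')
print(bool(re.fullmatch(pattern, '@--$')))
print(bool(re.fullmatch(pattern, 'ab')))
```

Because each member is escaped before it goes into the set, no nested set is ever produced, even when the input character is itself '[' or ']'.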
