简体   繁体   中英

How can i search and replace using python regex

I want to make the function which find for string in the array and then replace the corres[ponding element from the dictionary. so far i have tried this but i am not able to figure out few things like

  1. How can escape special characters
  2. I can i replace with match found. i tried \\1 but it didn't work

dsds

def myfunc(h):
        myarray = {
                "#":"\\#",
                "$":"\\$",
                "%":"\\%",
                "&":"\\&",
                "~":"\\~{}",
                "_":"\\_",
                "^":"\\^{}",
                "\\":"\\textbackslash{}",
                "{":"\\{",
                "}":"\\}"                
                    }
        pattern = "[#\$\%\&\~\_\^\\\\\{\}]"
        pattern_obj = re.compile(pattern, re.MULTILINE)
        new = re.sub(pattern_obj,myarray[\1],h)

        return new

You're looking for re.sub callbacks:

def myfunc(h):
    rules = {
            "#":r"\#",
            "$":r"\$",
            "%":r"\%",
            "&":r"\&",
            "~":r"\~{}",
            "_":r"\_",
            "^":r"\^{}",
            "\\":r"\textbackslash{}",
            "{":r"\{",
            "}":r"\}"                
    }
    pattern = '[%s]' % re.escape(''.join(rules.keys()))
    new = re.sub(pattern, lambda m: rules[m.group()], h)
    return new

This way you avoid 1) loops, 2) replacing already processed content.

You can try to use re.sub inside a loop that iterates over myarray.items(). However, you'll have to do backslash first since otherwise that might replace things incorrectly. You also need to make sure that "{" and "}" happen first, so that you don't mix up the matching. Since dictionaries are unordered I suggest you use list of tuples instead:

def myfunc(h):
    myarray = [
            ("\\","\\textbackslash")
            ("{","\\{"),
            ("}","\\}"),
            ("#","\\#"),
            ("$","\\$"),
            ("%","\\%"),
            ("&","\\&"),
            ("~","\\~{}"),
            ("_","\\_"),
            ("^","\\^{}")]

    for (val, replacement) in myarray:
        h = re.sub(val, replacement, h)
    h = re.sub("\\textbackslash", "\\textbackslash{}", h)

    return h
  1. I'd suggest you to use raw literal syntax ( r"" ) for better readability of the code.
  2. For the case of your array you may want just to use str.replace function instead of re.sub .
def myfunc(h):
    myarray = [
            ("\\", r"\textbackslash"),
            ("{", r"\{"),
            ("}", r"\}"),
            ("#", r"\#"),
            ("$", r"\$"),
            ("%", r"\%"),
            ("&", r"\&"),
            ("~", r"\~{}"),
            ("_", r"\_"),
            ("^", r"\^{}")]

    for (val, replacement) in myarray:
        h = h.replace(val, replacement)
    h = h.replace(r"\textbackslash", r"\textbackslash{}", h)

    return h

The code is a modification of @tigger's answer.

要转义元字符,请使用原始字符串和反斜杠

r"regexp with a \* in it"

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM