splitting and escaped forward slashes in Python

I have a file containing Perl-style regexes of the form /pattern/replace/ that I'm trying to read into Python as a list of compiled patterns and their associated replacement strings. Below is what I've done so far.

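import re

def get_regex(filename):
    regex = []
    # Read the file, skipping comment lines that start with "#".
    with open(filename, 'r') as fi:
        text = [l for l in fi if not l.startswith("#")]
    for line in text:
        # Drop the leading slash, split on "/", and discard the trailing piece.
        ptn, repl = line[1:].split('/')[:-1]
        regex.append((re.compile(ptn), repl))
    return regex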

This works perfectly well until I get to lines with escaped forward slashes, like this:

/$/ <\\/a>/

When I try to split this string, Python returns a list of three elements, ['$', ' <\\\\', 'a>'], rather than the hoped-for ['$', ' <\\\\/a>']. Is there some way to make split interpret the escapes?

Not really, no. str.split() has no notion of escape characters. Your best bet would probably be to use re.split() instead, with a regex that uses a negative lookbehind to make sure a forward slash isn't preceded by a backslash, e.g.:

UNESCAPED_SLASH_RE = re.compile(r'(?<!\\)/')
ptn, repl = UNESCAPED_SLASH_RE.split(line[1:])[:-1]
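
For completeness, here is a minimal sketch of how the lookbehind split might be folded back into the question's loader. The helper name parse_rule is made up for illustration, and stripping the trailing newline and skipping blank lines are assumptions about the file format rather than anything stated in the question.

```python
import re

# Matches a forward slash that is not immediately preceded by a backslash.
UNESCAPED_SLASH_RE = re.compile(r'(?<!\\)/')


def parse_rule(line):
    """Split one '/pattern/replace/' line into (compiled pattern, replacement).

    Hypothetical helper, not part of the original question or answer.
    """
    # Drop the trailing newline and the leading slash, split on unescaped
    # slashes only, and discard the empty piece after the closing slash.
    body = line.rstrip('\n')[1:]
    ptn, repl = UNESCAPED_SLASH_RE.split(body)[:-1]
    return re.compile(ptn), repl


def get_regex(filename):
    """Load every non-comment, non-blank line of the file as a rule."""
    with open(filename) as fi:
        return [parse_rule(line) for line in fi
                if line.strip() and not line.startswith('#')]
```

Two caveats: the split leaves the escaping backslashes in the returned strings, so removing them is a separate step if it is wanted, and the lookbehind checks only the single preceding character, so a slash that follows an escaped backslash is also treated as escaped.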
