splitting and escaped forward slashes in Python

Question

I have a file containing Perl-style regexes of the form /pattern/replace/ that I'm trying to read into Python as a list of compiled patterns with their associated replacement strings. Here is what I have so far:

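import re

def get_regex(filename):
    regex = []
    # Read the rules, skipping comment lines.
    with open(filename, 'r') as fi:
        text = [l for l in fi if not l.startswith("#")]
    for line in text:
        # Drop the leading slash, split on '/', and discard the field after the closing slash.
        ptn, repl = line[1:].split('/')[:-1]
        regex.append((re.compile(ptn), repl))
    return regex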

This works perfectly well until I get to lines with escaped forward slashes, like this:

/$/ <\\/s>/

When I try to split this string, Python returns a list of three elements, ['$', ' <\\\\', 's>'], rather than the hoped-for ['$', ' <\\\\/s>']. Is there some way to make split interpret the escapes?

Answer 1

Not really, no. Your best bet is probably to use re.split() instead, with a regex that uses a negative lookbehind to make sure a forward slash isn't escaped, e.g.:

UNESCAPED_SLASH_RE = re.compile(r'(?<!\\)/')
ptn, repl = UNESCAPED_SLASH_RE.split(line[1:])[:-1]
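
For anyone who wants to try this on the line from the question, here is a minimal self-contained sketch. The helper name parse_rule and the sample string are my own illustrations, not part of the original code, and the sketch assumes each line has exactly two fields and ends with a newline.

```python
import re

# Matches a forward slash that is not immediately preceded by a backslash.
UNESCAPED_SLASH_RE = re.compile(r'(?<!\\)/')

def parse_rule(line):
    """Split one /pattern/replace/ line into (pattern, replacement).

    Hypothetical helper: assumes exactly two fields per line.
    """
    # Drop the leading slash, split on unescaped slashes only, and
    # discard the last field (whatever follows the closing slash).
    ptn, repl = UNESCAPED_SLASH_RE.split(line[1:])[:-1]
    return ptn, repl

# The line from the question as it would be read from the file:
# two backslashes followed by a slash inside the replacement.
sample = '/$/ <\\\\/s>/\n'
result = parse_rule(sample)
print(result)
```

Note that the lookbehind only looks at the single character before the slash, so any slash that directly follows a backslash is treated as escaped. If your file can contain an escaped backslash right before a real delimiter, this pattern will not split there.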

Answer 1: accepted, score 3, posted 2011-10-10 20:20:23
