
Is there a single Python regex that can change all “foo” to “bar” on lines starting with “#”?

Is it possible to write a single Python regular expression that can be applied to a multi-line string and change all occurrences of "foo" to "bar", but only on lines beginning with "#"?

I was able to get this working in Perl, using Perl's \G regular expression assertion, which matches at the position where the previous match ended. However, Python's re module doesn't appear to support this.

Here's the Perl solution, in case it helps:

my $x =<<EOF;
# foo
foo
# foo foo
EOF

$x =~ s{
        (            # begin capture
          (?:\G|^\#) # end of last match, or start of line plus hash
          .*?        # followed by anything, non-greedily
        )            # end capture
        foo
      }
      {$1bar}xmg;

print $x;

The proper output, of course, is:

# bar
foo
# bar bar

Can this be done in Python?


Edit: Yes, I know that it's possible to split the string into individual lines and test each line and then decide whether to apply the transformation, but please take my word that doing so would be non-trivial in this case. I really do need to do it with a single regular expression.

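lines = mystring.split('\n')
for i, line in enumerate(lines):
    if line.startswith('#'):
        # assign back into the list; rebinding `line` alone changes nothing
        lines[i] = line.replace('foo', 'bar')
mystring = '\n'.join(lines)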

No need for a regex.

It looked pretty easy to do with a regular expression:

>>> import re
>>> text = """line 1
... line 2
... Barney Rubble Cutherbert Dribble and foo
... line 4
... # Flobalob, bing, bong, foo and brian
... line 6"""
>>> regexp = re.compile(r'^(#.+)foo', re.MULTILINE)
>>> print(re.sub(regexp, r'\g<1>bar', text))
line 1
line 2
Barney Rubble Cutherbert Dribble and foo
line 4
# Flobalob, bing, bong, bar and brian
line 6

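But trying it on your example text doesn't go so well: .+ is greedy and each line can match only once, so only the last "foo" on a "#" line gets replaced: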

>>> text = """# foo
... foo
... # foo foo"""
>>> regexp = re.compile(r'^(#.+)foo', re.MULTILINE)
>>> print(re.sub(regexp, r'\g<1>bar', text))
# bar
foo
# foo bar

So, try this:

>>> regexp = re.compile('(^#|\g.+)foo', re.MULTILINE)
>>> print re.sub(regexp, '\g<1>bar', text)
# foo
foo
# foo foo

That ran without an error, but the output is unchanged, and I can't find \g as pattern syntax in the documentation!

Moral: don't try to code after a couple of beers.
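
For what it's worth, here is a minimal sketch of one way to keep this to a single re.sub call without \G: match each whole "#" line and let a replacement function do the inner substitution. The function name replace_in_comment_lines and its old/new parameters are made up for illustration, and this is a different technique from the Perl version, not a port of it. (The third-party regex package advertises a \G anchor, but that is not tried here.)

```python
import re


def replace_in_comment_lines(text, old="foo", new="bar"):
    """Replace `old` with `new`, but only on lines that start with '#'."""
    # (?m) makes ^ match at the start of every line, and '.' does not
    # cross newlines, so each match is exactly one '#' line.
    # The lambda receives that line and rewrites every `old` inside it.
    return re.sub(r"(?m)^#.*", lambda m: m.group(0).replace(old, new), text)


text = "# foo\nfoo\n# foo foo"
print(replace_in_comment_lines(text))
```

On the example text from the question this prints "# bar", "foo" and "# bar bar" on three lines. The inner replace is a plain string replace; if "foo" itself had to be a pattern, the lambda could call re.sub again instead.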

\g is in the Python docs, but it is not the same thing as Perl's \G: it is only recognized in the replacement string, where it refers to a captured group.

"In addition to character escapes and backreferences as described above, \g<name> will use the substring matched by the group named name, as defined by the (?P<name>...) syntax. \g<number> uses the corresponding group number; \g<2> is therefore equivalent to \2, but isn't ambiguous in a replacement such as \g<2>0. \20 would be interpreted as a reference to group 20, not a reference to group 2 followed by the literal character '0'. The backreference \g<0> substitutes in the entire substring matched by the RE."
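
The quoted passage can be tried out with a short sketch like the one below. The sample string and the variable names are invented for illustration; the point is only to show \g<name>, \g<number> and \g<0> inside a replacement string, and what happens with the ambiguous \20.

```python
import re

text = "John Smith"

# Named groups, referenced as \g<name> in the replacement.
swapped = re.sub(r"(?P<first>\w+) (?P<last>\w+)", r"\g<last>, \g<first>", text)

# Group 2 followed by a literal "0": \g<2>0 is unambiguous.
with_zero = re.sub(r"(\w+) (\w+)", r"\g<2>0", text)

# \20 is read as a reference to group 20, which this pattern does not have,
# so the substitution fails with re.error.
try:
    re.sub(r"(\w+) (\w+)", r"\20", text)
    group_20_error = None
except re.error as exc:
    group_20_error = str(exc)

# \g<0> stands for the entire match.
wrapped = re.sub(r"\w+", r"[\g<0>]", text)

print(swapped)
print(with_zero)
print(group_20_error)
print(wrapped)
```

Note that all of these are replacement-string features. None of them anchors a pattern at the end of the previous match the way Perl's \G does.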
