
Ask for a regex for re.sub in Python

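I have some strings like the ones below. There may be zero or more spaces before or after the "=", and there may be zero or one "### comment" at the end of the string.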

log_File = a.log   ### the path for log
log_level = 10 

Now I want to replace the string to the right of the "=". For example, I want to set them as follows:

log_File = b.log   ### the path for log
log_level = 40 

import re
s="log_File = a.log   ### the path for log"
re.sub(r"(?<=\s)\w+\S+", 'Hello', s)

The code above replaces every string after the "=" with Hello. I don't want to replace the text after "###". How can I achieve that?

Try the following code:

>>> re.sub(r'(?<!#)=(.*?)(?=\s*#|$)', r'= Hello', s, 1)
'log_File = Hello   ### the path for log'
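To apply the same pattern to both lines from the question with a different value for each, a minimal sketch follows (Python 3). The helper name set_value and the two sample lists are assumptions made for illustration, not part of the answer. A callable is used as the replacement so that backslashes in the new value are not read as group references.

```python
import re

# Same pattern as in the answer above: an "=" not directly preceded by "#",
# then lazily everything up to optional whitespace followed by "#" or the end.
PATTERN = re.compile(r'(?<!#)=(.*?)(?=\s*#|$)')

def set_value(line, new_value):
    """Hypothetical helper: replace the text right of the first '=' and keep a trailing comment."""
    return PATTERN.sub(lambda m: '= ' + new_value, line, count=1)

lines = [
    "log_File = a.log   ### the path for log",
    "log_level = 10 ",
]
new_values = ["b.log", "40"]

for line, value in zip(lines, new_values):
    print(set_value(line, value))
```

Note that the lookbehind only inspects the single character directly before the "=", so a commented-out line such as "# a = b" is still rewritten by this pattern.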

Without a regular expression (a modified version of Inbar Rose's answer):

def replace_value(s, new):
    content, sep1, comment = s.partition('#')
    key, sep2, value = content.partition('=')
    if sep2: content = key + sep2 + new
    return content + sep1 + comment

assert replace_value('log_File = b', ' Hello') == 'log_File = Hello'
assert replace_value('#log_File = b', ' Hello') == '#log_File = b'
assert replace_value('#This is comment', ' Hello') == '#This is comment'
assert replace_value('log_File = b # hello', ' Hello') == 'log_File = Hello# hello'
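As the last assertion shows, replace_value drops the whitespace that separated the old value from the comment. If that spacing should survive, one possible variation is sketched below; the name replace_value_keep_spacing is hypothetical, and like the original it assumes that the first "#" on the line starts the comment.

```python
def replace_value_keep_spacing(s, new):
    """Like replace_value, but keep the whitespace between the value and the comment."""
    content, sep1, comment = s.partition('#')
    key, sep2, value = content.partition('=')
    if sep2:
        # The trailing whitespace of the old value is what separated it from the comment.
        trailing = value[len(value.rstrip()):]
        content = key + sep2 + new + trailing
    return content + sep1 + comment

print(replace_value_keep_spacing('log_File = a.log   ### the path for log', ' b.log'))
print(replace_value_keep_spacing('log_level = 10 ', ' 40'))
```

A "#" inside a quoted value would still be taken for the start of a comment, which is a limit of the partition approach itself.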

I don't see where the problem is.

What about the following code?

import re

pat = r'(=\s*).+?(?=\s*(#|$))'
rgx = re.compile(pat,re.MULTILINE)

su = '''log_File = a.log   ### the path for log   
log_File = a.log   
log_File = a.log'''

print su
print
print rgx.sub('\\1Hello',su)
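The snippet above uses Python 2 print statements. For readers on Python 3, the same idea can be sketched as follows. The pattern is the one from the answer, written as a raw string, and the sample text is built with join only so that the trailing spaces stay visible.

```python
import re

# Group 1 captures "=" and the whitespace after it; the value is matched lazily
# up to optional whitespace followed by "#" or the end of a line (re.MULTILINE).
rgx = re.compile(r'(=\s*).+?(?=\s*(#|$))', re.MULTILINE)

su = "\n".join([
    "log_File = a.log   ### the path for log   ",
    "log_File = a.log   ",
    "log_File = a.log",
])

result = rgx.sub(r'\1Hello', su)
print(result)
```

Each value is replaced by Hello, while the comment on the first line and the trailing spaces are left as they were.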

EDIT

Now I see where the problem is!

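As I wrote, I don't think the problem can be solved by a regex alone or by a relatively simple function. Changing the right-hand side of an assignment (the attribute called value in the AST node of an assignment) without touching a possible comment requires syntactic analysis to determine what the left-hand side of the assignment is (the attribute called targets in the AST node of an assignment), what the right-hand side is, and what the possible comment on the line is. Even deciding that a line is not an assignment statement requires syntactic analysis.

For such a task, I think only the ast module, which helps Python applications process trees of the Python abstract syntax grammar, provides the tools to reach the goal.

Here is the code I managed to write based on this idea (Python 2, as the print statements show):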
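To make the targets and value attributes concrete, here is a small Python 3 sketch that parses a single line and inspects its assignment node. The sample line is an assumption chosen so that a "#" appears both inside the string and in the comment.

```python
import ast

line = 'x = "abc#ghi"    # comment'

# Parsing one line as a module gives a body holding a single ast.Assign node.
node = ast.parse(line).body[0]

print(type(node).__name__)           # Assign
print(node.targets[0].id)            # x
print(ast.literal_eval(node.value))  # abc#ghi
print(node.value.col_offset)         # 4
print(repr(line[node.value.col_offset:]))
```

The parser already knows that the first "#" belongs to the string, which is exactly the distinction a plain regex cannot make. The comment itself is not stored in the tree, so its position still has to be worked out separately.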

import re,ast       
from sys import exit

su = '''# it's nothing
import re
def funcg(a,b):\r
    print a*b + 900
x = "abc#ghi"\t\t# comment
k = 103
dico["abc#12"] = [(x,x//3==0) for x in xrange(25) if x !=12]
dico["ABC#12"] = 45   # comment
a = 'lulu#88'
dico["mu=$*"] = 'mouth#30'  #ohoh
log_File = a.log
y = b.log ### x = a.log  
'''

print su

def subst_assign_val_in_line(line,b0,repl):
    assert(isinstance(b0,ast.AST))
    coloffset = b0.value.col_offset
    VA = line[coloffset:]
    try:
        yy = compile(VA+'\n',"-expr-",'eval')
    except: # because of a bug of ast in computing VA
        coloffset = coloffset - 1
        VA = line[coloffset:]
        yy = compile(VA+'\n',"-expr-",'eval')

    gen = ((i,c) for i,c in enumerate(VA) if c=='#')
    for i,c in gen:
        VAshort = VA[0:i] # <== cuts in front of a # character
        try:
            yyi = compile(VAshort+'\n',"-exprshort-",'eval')
        except:
            pass
        else:
            if yy==yyi:
                return (line[0:coloffset] + repl + ' ' +
                        line[coloffset+i:])
                break
            else:
                print 'VA = line[%d:]' % coloffset
                print 'VA :  %r' % VA
                print '  yy != yyi  on:'
                print 'VAshort : %r' % VAshort
                raw_input('  **** UNIMAGINABLE CASE ***')

    else:
        return line[0:coloffset] + repl



def subst_assigns_vals_in_text(text,repl,
                               rgx = re.compile('\A([ \t]*)(.*)')):

    def yi(text):
        for line in text.splitlines():
            head,line = rgx.search(line).groups()
            try:
                body = ast.parse(line,'line','exec').body
            except:
                yield head + line
            else:   
                if isinstance(body,list):
                    if len(body)==0:
                        yield head + line
                    elif len(body)==1:
                        if type(body[0])==ast.Assign:
                            yield head + subst_assign_val_in_line(line,
                                                                  body[0],
                                                                  repl)
                        else:
                            yield head + line
                    else:
                        print "list ast.parse(line,'line','exec').body has more than 1 element"
                        print body
                        exit()
                else:
                    print "ast.parse(line,'line','exec').body is not a list"
                    print body
                    exit()

    return '\n'.join(yi(text))

print subst_assigns_vals_in_text(su,repl='Hello')

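In fact, I wrote it with the help of print statements and of a personal program that displays an AST tree in a way that is readable (for me).
Here is the same code with print statements added, to follow the process: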

import re,ast       
from sys import exit

su = '''# it's nothing
import re
def funcg(a,b):\r
    print a*b + 900
x = "abc#ghi"\t\t# comment
k = 103
dico["abc#12"] = [(x,x//3==0) for x in xrange(25) if x !=12]
dico["ABC#12"] = 45   # comment
a = 'lulu#88'
dico["mu=$*"] = 'mouth#30'  #ohoh
log_File = a.log
y = b.log ### x = a.log  
'''

print su
print '#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-'

def subst_assign_val_in_line(line,b0,repl):
    assert(isinstance(b0,ast.AST))
    print '\n%%%%%%%%%%%%%%%%\nline :  %r' % line
    print '\nb0 == body[0]: ',b0
    print '\nb0.value: ',b0.value
    print '\nb0.value.col_offset==',b0.value.col_offset
    coloffset = b0.value.col_offset
    VA = line[coloffset:]
    try:
        yy = compile(VA+'\n',"-expr-",'eval')
    except: # because of a bug of ast in computing VA
        coloffset = coloffset - 1
        VA = line[coloffset:]
        yy = compile(VA+'\n',"-expr-",'eval')
    print 'VA = line[%d:]' % coloffset
    print 'VA :  %r' % VA
    print ("yy = compile(VA+'\\n',\"-expr-\",'eval')\n"
           'yy =='),yy
    gen = ((i,c) for i,c in enumerate(VA) if c=='#')
    deb = ("mwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmw\n"
           "    mwmwmwm '#' in VA  mwmwmwm\n")
    for i,c in gen:
        print '%si == %d   VA[%d] == %r' % (deb,i,i,c)
        deb = ''
        VAshort = VA[0:i] # <== cuts in front of a # character
        print '  VAshort = VA[0:%d] == %r' % (i,VAshort)
        try:
            yyi = compile(VAshort+'\n',"-exprshort-",'eval')
        except:
            print "  compile(%r+'\\n',\"-exprshort-\",'eval') gives error" % VAshort
        else:
            print ("  yyi = compile(VAshort+'\\n',\"-exprshort-\",'eval')\n"
                   '  yyi =='),yyi
            if yy==yyi:
                print '  yy==yyi   Real value of assignment found'
                print "mwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmw"
                return (line[0:coloffset] + repl + ' ' +
                        line[coloffset+i:])
                break
            else:
                print 'VA = line[%d:]' % coloffset
                print 'VA :  %r' % VA
                print '  yy != yyi  on:'
                print 'VAshort : %r' % VAshort
                raw_input('  **** UNIMAGINABLE CASE ***')
    else:
        return line[0:coloffset] + repl


def subst_assigns_vals_in_text(text,repl,
                               rgx = re.compile('\A([ \t]*)(.*)')):

    def yi(text):
        for line in text.splitlines():
            raw_input('\n\npause')
            origline = line
            head,line = rgx.search(line).groups()
            print ('#########################################\n'
                   '#########################################\n'
                   'line     : %r\n'
                   'cut line : %r' % (origline,line))
            try:
                body = ast.parse(line,'line','exec').body
            except:
                yield head + line
            else:
                if isinstance(body,list):
                    if len(body)==0:
                        yield head + line
                    elif len(body)==1:
                        if type(body[0])==ast.Assign:
                            yield head + subst_assign_val_in_line(line,
                                                                  body[0],
                                                                  repl)
                        else:
                            yield head + line
                    else:
                        print "list ast.parse(line,'line','exec').body has more than 1 element"
                        print body
                        exit()
                else:
                    print "ast.parse(line,'line','exec').body is not a list"
                    print body
                    exit()

    #in place of return '\n'.join(yi(text)) , to print the output
    def returning(text):
        for output in yi(text):
            print 'output   : %r' % output
            yield output

    return '\n'.join(returning(text))


print '\n\n\n%s' % subst_assigns_vals_in_text(su,repl='Hello')

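I don't give any explanation, because that would require explaining the structure of the AST tree that ast.parse() creates for a piece of code. If asked, I will give some hints about my code.

Note that ast.parse() has a bug when it reports the line and column at which certain nodes begin, so I had to correct that with additional lines of code.
For example, it gives a wrong result on a list comprehension.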
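On Python 3.8 and later, AST nodes also carry end_lineno and end_col_offset, so the end of the value can be read directly instead of being searched for. The sketch below is an alternative under that assumption, not the method used in the code above; the helper name replace_assigned_value is hypothetical.

```python
import ast

def replace_assigned_value(line, repl):
    """Hypothetical helper: swap the right-hand side of a one-line assignment."""
    stripped = line.lstrip(' \t')
    head = line[:len(line) - len(stripped)]
    try:
        body = ast.parse(stripped).body
    except SyntaxError:
        return line                # not parseable on its own: leave untouched
    if len(body) != 1 or not isinstance(body[0], ast.Assign):
        return line                # comment, import, expression, ...: leave untouched
    value = body[0].value
    # col_offset and end_col_offset are UTF-8 byte offsets, so slice the encoded line.
    raw = stripped.encode('utf-8')
    new_raw = raw[:value.col_offset] + repl.encode('utf-8') + raw[value.end_col_offset:]
    return head + new_raw.decode('utf-8')

for sample in ['x = "abc#ghi"\t\t# comment',
               'dico["ABC#12"] = 45   # comment',
               'y = b.log ### x = a.log  ',
               "# it's nothing"]:
    print(repr(replace_assigned_value(sample, 'Hello')))
```

Everything after the end of the value, including the spacing and the comment, is copied through unchanged.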

