Ask for a regex for re.sub in Python
I have some strings like these. There will be 0 or more whitespace characters before or after =, and there will be 0 or 1 ### comment at the end of the string.
log_File = a.log ### the path for log
log_level = 10
Now I want to replace the string on the right of =. For example, set them as below:
log_File = b.log ### the path for log
log_level = 40
import re
s="log_File = a.log ### the path for log"
re.sub(r"(?<=\s)\w+\S+", "Hello", s)
The above code replaces every string after = with Hello, but I don't want to replace the strings after ###. How can I implement this?
Try the following code:
>>> re.sub(r'(?<!#)=(.*?)(?=\s*#|$)', r'= Hello', s, 1)
'log_File = Hello ### the path for log'
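As a rough sketch of how the pattern above could be applied to both sample lines from the question, the snippet below wraps it in a small helper. The name set_value is only illustrative and not part of the answer. A function is used as the replacement so that backslashes in the new value are not interpreted by re.sub.

```python
import re

# Same pattern as in the answer above: an "=" not directly preceded by "#",
# then the shortest run of characters up to an optional comment or the end.
PATTERN = re.compile(r'(?<!#)=(.*?)(?=\s*#|$)')

def set_value(line, new_value):
    # Hypothetical helper name. The lambda returns the replacement text
    # literally, so new_value may contain backslashes.
    return PATTERN.sub(lambda m: '= ' + new_value, line, count=1)

print(set_value('log_File = a.log ### the path for log', 'b.log'))
# log_File = b.log ### the path for log
print(set_value('log_level = 10', '40'))
# log_level = 40
```

Note that this still assumes the first = not directly preceded by # is the key/value separator, and that the value itself contains no # character.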
Without using regular expressions (a modified version of Inbar Rose's answer):
def replace_value(s, new):
    content, sep1, comment = s.partition('#')
    key, sep2, value = content.partition('=')
    if sep2: content = key + sep2 + new
    return content + sep1 + comment
assert replace_value('log_File = b', ' Hello') == 'log_File = Hello'
assert replace_value('#log_File = b', ' Hello') == '#log_File = b'
assert replace_value('#This is comment', ' Hello') == '#This is comment'
assert replace_value('log_File = b # hello', ' Hello') == 'log_File = Hello# hello'
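The last assert shows that the space in front of the comment is lost. A possible variant, sketched here under the assumption that the original spacing around the value should be kept, splits the value into leading whitespace, text and trailing whitespace. The function name replace_value_keep_spacing is made up for this example.

```python
def replace_value_keep_spacing(s, new):
    # Hypothetical variant of replace_value that preserves the whitespace
    # around the old value, so the comment keeps its position.
    content, sep1, comment = s.partition('#')
    key, sep2, value = content.partition('=')
    if not sep2:
        return s  # no "=" before the comment: leave the line untouched
    lead_len = len(value) - len(value.lstrip())
    lead = value[:lead_len]            # whitespace between "=" and the value
    rest = value[lead_len:]
    trail = rest[len(rest.rstrip()):]  # whitespace between value and comment
    return key + sep2 + lead + new + trail + sep1 + comment

print(replace_value_keep_spacing('log_File = a.log ### the path for log', 'b.log'))
# log_File = b.log ### the path for log
print(replace_value_keep_spacing('log_level = 10', '40'))
# log_level = 40
```

Like the original function, it treats the first # on the line as the start of the comment, so it will cut in the wrong place if the value itself contains a #.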
I don't see where the problem is. What about the following code?
import re
pat = '(=\s*).+?(?=\s*(#|$))'
rgx = re.compile(pat,re.MULTILINE)
su = '''log_File = a.log ### the path for log
log_File = a.log
log_File = a.log'''
print su
print
print rgx.sub('\\1Hello',su)
I have now seen where the problem is!
As I wrote, I don't think the problem can be solved by a regex alone or by a relatively simple function. Changing the right-hand side of an assignment (the attribute called value in the AST node of an assignment) without touching a possible comment requires a syntactic analysis to determine what the left-hand side is (the attribute called targets in the AST node of an assignment), what the right-hand side is, and what the possible comment in the line is. And even if a line isn't an assignment statement, a syntactic analysis is required to determine that.
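To make the difficulty concrete, here is a small sketch that runs the regex from the earlier answer on this page over two lines in which # or = appears inside a string literal. The two lines are modeled on the sample text used in the code below, and the variable names are only illustrative.

```python
import re

# Pattern taken from the earlier regex answer on this page.
rgx = re.compile(r'(=\s*).+?(?=\s*(#|$))', re.MULTILINE)

# A "#" inside a string literal is mistaken for the start of a comment.
result1 = rgx.sub(r'\1Hello', 'x = "abc#ghi"  # comment')
print(result1)
# x = Hello#ghi"  # comment

# An "=" inside the subscript key is mistaken for the assignment sign.
result2 = rgx.sub(r'\1Hello', 'dico["mu=$*"] = \'mouth#30\' #ohoh')
print(result2)
# dico["mu=Hello#30' #ohoh
```

In both cases the match stops at the first # it meets, even though that # is part of a string, and in the second case it also starts at an = that belongs to the key.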
In my opinion, only the ast module, which helps Python applications process trees of the Python abstract syntax grammar, provides the tools to achieve this goal.
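As a minimal illustration of the idea, and not the code of this answer, the sketch below assumes Python 3.8 or later, where AST nodes also carry an end_col_offset attribute. The helper name replace_assigned_value is hypothetical. It only handles single-line assignments, and it only works on lines that happen to be valid Python.

```python
import ast

def replace_assigned_value(line, repl):
    """Hypothetical helper: replace the right-hand side of a one-line assignment."""
    stripped = line.lstrip(' \t')
    head = line[:len(line) - len(stripped)]  # leading indentation, kept as-is
    try:
        body = ast.parse(stripped).body
    except SyntaxError:
        return line  # not parsable on its own (e.g. "def f():"): leave it alone
    if len(body) != 1 or not isinstance(body[0], ast.Assign):
        return line  # comment, blank line, or a non-assignment statement
    value = body[0].value
    # col_offset and end_col_offset are UTF-8 byte offsets, so slice the bytes.
    data = stripped.encode('utf-8')
    new = data[:value.col_offset] + repl.encode('utf-8') + data[value.end_col_offset:]
    return head + new.decode('utf-8')

print(replace_assigned_value('x = "abc#ghi"\t\t# comment', 'Hello'))
print(replace_assigned_value('dico["mu=$*"] = \'mouth#30\' #ohoh', 'Hello'))
# dico["mu=$*"] = Hello #ohoh
```

Because the parser, not a pattern, decides where the value starts and ends, a # or = inside a string literal is no longer a problem. Parenthesized values and multi-line statements are not handled by this sketch.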
Here's the code I managed to write based on this idea:
import re,ast
from sys import exit
su = '''# it's nothing
import re
def funcg(a,b):\r
print a*b + 900
x = "abc#ghi"\t\t# comment
k = 103
dico["abc#12"] = [(x,x//3==0) for x in xrange(25) if x !=12]
dico["ABC#12"] = 45 # comment
a = 'lulu#88'
dico["mu=$*"] = 'mouth#30' #ohoh
log_File = a.log
y = b.log ### x = a.log
'''
print su
def subst_assign_val_in_line(line,b0,repl):
    assert(isinstance(b0,ast.AST))
    coloffset = b0.value.col_offset
    VA = line[coloffset:]
    try:
        yy = compile(VA+'\n',"-expr-",'eval')
    except: # because of a bug of ast in computing VA
        coloffset = coloffset - 1
        VA = line[coloffset:]
        yy = compile(VA+'\n',"-expr-",'eval')
    gen = ((i,c) for i,c in enumerate(VA) if c=='#')
    for i,c in gen:
        VAshort = VA[0:i] # <== cuts in front of a # character
        try:
            yyi = compile(VAshort+'\n',"-exprshort-",'eval')
        except:
            pass
        else:
            if yy==yyi:
                return (line[0:coloffset] + repl + ' ' +
                        line[coloffset+i:])
                break
            else:
                print 'VA = line[%d:]' % coloffset
                print 'VA : %r' % VA
                print ' yy != yyi on:'
                print 'VAshort : %r' % VAshort
                raw_input(' **** UNIMAGINABLE CASE ***')
    else:
        return line[0:coloffset] + repl
def subst_assigns_vals_in_text(text,repl,
                               rgx = re.compile('\A([ \t]*)(.*)')):
    def yi(text):
        for line in text.splitlines():
            head,line = rgx.search(line).groups()
            try:
                body = ast.parse(line,'line','exec').body
            except:
                yield head + line
            else:
                if isinstance(body,list):
                    if len(body)==0:
                        yield head + line
                    elif len(body)==1:
                        if type(body[0])==ast.Assign:
                            yield head + subst_assign_val_in_line(line,
                                                                  body[0],
                                                                  repl)
                        else:
                            yield head + line
                    else:
                        print "list ast.parse(line,'line','exec').body has more than 1 element"
                        print body
                        exit()
                else:
                    print "ast.parse(line,'line','exec').body is not a list"
                    print body
                    exit()
    return '\n'.join(yi(text))
print subst_assigns_vals_in_text(su,repl='Hello')
In fact, I wrote it with the help of print statements and of a personal program that displays an AST tree in a readable manner (for me).
Below is the code with only the print statements added, so that the process can be followed:
import re,ast
from sys import exit
su = '''# it's nothing
import re
def funcg(a,b):\r
print a*b + 900
x = "abc#ghi"\t\t# comment
k = 103
dico["abc#12"] = [(x,x//3==0) for x in xrange(25) if x !=12]
dico["ABC#12"] = 45 # comment
a = 'lulu#88'
dico["mu=$*"] = 'mouth#30' #ohoh
log_File = a.log
y = b.log ### x = a.log
'''
print su
print '#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-'
def subst_assign_val_in_line(line,b0,repl):
    assert(isinstance(b0,ast.AST))
    print '\n%%%%%%%%%%%%%%%%\nline : %r' % line
    print '\nb0 == body[0]: ',b0
    print '\nb0.value: ',b0.value
    print '\nb0.value.col_offset==',b0.value.col_offset
    coloffset = b0.value.col_offset
    VA = line[coloffset:]
    try:
        yy = compile(VA+'\n',"-expr-",'eval')
    except: # because of a bug of ast in computing VA
        coloffset = coloffset - 1
        VA = line[coloffset:]
        yy = compile(VA+'\n',"-expr-",'eval')
    print 'VA = line[%d:]' % coloffset
    print 'VA : %r' % VA
    print ("yy = compile(VA+'\\n',\"-expr-\",'eval')\n"
           'yy =='),yy
    gen = ((i,c) for i,c in enumerate(VA) if c=='#')
    deb = ("mwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmw\n"
           " mwmwmwm '#' in VA mwmwmwm\n")
    for i,c in gen:
        print '%si == %d VA[%d] == %r' % (deb,i,i,c)
        deb = ''
        VAshort = VA[0:i] # <== cuts in front of a # character
        print ' VAshort = VA[0:%d] == %r' % (i,VAshort)
        try:
            yyi = compile(VAshort+'\n',"-exprshort-",'eval')
        except:
            print " compile(%r+'\\n',\"-exprshort-\",'eval') gives error" % VAshort
        else:
            # show the code object compiled from the shortened text
            print (" yyi = compile(VAshort+'\\n',\"-exprshort-\",'eval')\n"
                   ' yyi =='),yyi
            if yy==yyi:
                print ' yy==yyi Real value of assignment found'
                print "mwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmwmw"
                return (line[0:coloffset] + repl + ' ' +
                        line[coloffset+i:])
                break
            else:
                print 'VA = line[%d:]' % coloffset
                print 'VA : %r' % VA
                print ' yy != yyi on:'
                print 'VAshort : %r' % VAshort
                raw_input(' **** UNIMAGINABLE CASE ***')
    else:
        return line[0:coloffset] + repl
def subst_assigns_vals_in_text(text,repl,
                               rgx = re.compile('\A([ \t]*)(.*)')):
    def yi(text):
        for line in text.splitlines():
            raw_input('\n\npause')
            origline = line
            head,line = rgx.search(line).groups()
            print ('#########################################\n'
                   '#########################################\n'
                   'line : %r\n'
                   'cut line : %r' % (origline,line))
            try:
                body = ast.parse(line,'line','exec').body
            except:
                yield head + line
            else:
                if isinstance(body,list):
                    if len(body)==0:
                        yield head + line
                    elif len(body)==1:
                        if type(body[0])==ast.Assign:
                            yield head + subst_assign_val_in_line(line,
                                                                  body[0],
                                                                  repl)
                        else:
                            yield head + line
                    else:
                        print "list ast.parse(line,'line','exec').body has more than 1 element"
                        print body
                        exit()
                else:
                    print "ast.parse(line,'line','exec').body is not a list"
                    print body
                    exit()
    #in place of return '\n'.join(yi(text)) , to print the output
    def returning(text):
        for output in yi(text):
            print 'output : %r' % output
            yield output
    return '\n'.join(returning(text))
print '\n\n\n%s' % subst_assigns_vals_in_text(su,repl='Hello')
I don't give explanations because it would take too long to explain the structure of the AST tree that ast.parse() creates for a piece of code. I will shed some light on my code if asked.
NB: there's a bug in how ast.parse() reports the line and column at which certain nodes begin, so I had to correct for it with additional lines of instructions. For example, it gives a wrong result on a list comprehension.