Regular Expression To Find C Style Comments
I am trying to write a regular expression to find C style headers in Java source files. At the moment I am experimenting with this in Python.
Here is my source code:
import re
text = """/*
* Copyright blah blah blha blah
* blah blah blah blah
* 2008 blah blah blah @ org
*/"""
print
print "I guess the program printed the correct thing."
pattern = re.compile("^/.+/$")
print "-----------"
print pattern
pos = 0
while True:
    match = pattern.search(text, pos)
    if not match:
        break
    s = match.start()
    e = match.end()
    print ' %2d : %2d = "%s"' % (s, e-1, text[s:e])
    pos = e
I am trying to write a simple expression that just looks for anything between a forward slash and another forward slash. I can make the regular expression more complicated later.
Does anyone know where I am going wrong? I am using a forward slash, the dot meta-character, the plus symbol for 1 or more things, and the dollar symbol for the end.
I don't think you should anchor the match (using '^' and '$').
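As a rough illustration of why the anchored pattern finds nothing, here is a small Python 3 sketch. It assumes the default `re` behaviour, where `.` does not match a newline and `^`/`$` anchor at the start and end of the whole string; the `embedded` sample is made up for the demonstration.

```python
import re

text = """/*
* Copyright blah blah blha blah
* blah blah blah blah
* 2008 blah blah blah @ org
*/"""

# The pattern from the question: anchored, and '.' stops at newlines by default,
# so '.+' can never reach the closing slash on the last line.
anchored = re.compile(r"^/.+/$")
print(anchored.search(text))  # None

# With re.DOTALL, '.' crosses newlines and the anchored pattern matches this text.
dotall = re.compile(r"^/.+/$", re.DOTALL)
print(dotall.search(text) is not None)  # True

# The anchors still reject a comment that is not the entire string.
# 'embedded' is a hypothetical sample, not part of the original question.
embedded = "package foo;\n" + text + "\nclass A {}"
print(dotall.search(embedded))  # None
```

So the dot is the immediate cause of the failure on this input, and the anchors become the problem as soon as the comment sits inside a larger file.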
Secondly, I think the regex should be r"/[^/]*/", which matches a portion of a string that starts with a slash, followed by zero or more non-slash characters, and then terminates with a slash.
To wit:
>>> import re
>>> text = """foo bar baz
... /*
... * Copyright blah blah blha blah
... * blah blah blah blah
... * 2008 blah blah blah @ org
... */"""
>>> rx = re.compile(r"/[^/]*/", re.DOTALL)
>>> mo = rx.search(text)
>>> text[mo.start(): mo.end()]
'/*\n * Copyright blah blah blha blah \n * blah blah blah blah \n * 2008 blah blah blah @ org\n */'
Note that the comment does not start at the start of the string, but the regex still finds it nicely.
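One caveat: r"/[^/]*/" stops at the first slash it meets, so a comment that contains a slash (a URL, say) is cut short, and a division operator can start a false match. If that matters, a possible refinement is to match the `/*` and `*/` delimiters explicitly with a non-greedy body. The sketch below is only that, a sketch in Python 3: the `source` sample and the `find_comments` helper are invented for illustration, and the stricter pattern still ignores string literals and `//` comments.

```python
import re

# Hypothetical sample: a slash inside the comment and a division in the code.
source = """/*
 * Copyright 2008 example.org/license
 */
int half = total / 2; /* inline note */
"""

# Pattern from the answer: slash, any non-slash characters, slash.
loose = re.compile(r"/[^/]*/")

# Stricter variant: literal '/*', shortest possible body, literal '*/'.
# re.DOTALL is needed here because the body uses '.', which must cross newlines.
strict = re.compile(r"/\*.*?\*/", re.DOTALL)


def find_comments(pattern, text):
    """Return every non-overlapping match of pattern in text."""
    return [m.group(0) for m in pattern.finditer(text)]


print(find_comments(loose, source)[0])   # truncated at "example.org/"
print(find_comments(strict, source))     # the two complete comments
```

Using `finditer` also replaces the manual `while True` / `pos` loop from the question, since it already walks through successive matches.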