Perl not matching regex?
I'm trying to remove all the comments in a bunch of SGF files, and have come up with the following perl command:
perl -pi -e 's/P?C\[(?:[^\]\\]++|\\.)*+\]//gm' *.sgf
I'm trying to match and remove a C or PC followed by a left bracket, then any characters that aren't right brackets (a right bracket inside the comment has to be escaped with a \), and then a closing right bracket.
I'm trying to match the following examples:
C[HelloBot9 [-\]: GTP Engine for HelloBot9 (white): HelloBot version 0.6.26.08]
PC[IA [-\]: GTP Engine for IA (black): GNU Go version 3.7.11
]
C[person [-\]: \\\]]
C[AyaMC [3k\]: GTP Engine for AyaMC (black): Aya version 6.61 : If you pass, AyaMC
will pass. When AyaMC does not, please remove all dead stones.]
And some examples that shouldn't be matched:
XYZ[Other stuff \]]
C[stuff\]
PC[stuff\\\]
The regex works in several online regex testers (including a few that state they are Perl regex testers), but for some reason it doesn't work on the command line. Help is appreciated.
You need to run perl with the -0777 option so that matches spanning several lines can be found. With -p alone, perl reads the input one line at a time, so a comment that continues onto the next line is never seen as a whole; -0777 makes perl read each file in as a single string. So, using perl -0777pi -e instead of perl -pi -e will solve the issue.
I would also suggest optimizing the pattern a bit by unrolling the alternation group, which makes the matching process "linear":
s/P?C\[[^]\\]*(?:\\.[^]\\]*+)*]//sg
Note that if PC should be matched as a whole word, add \b before P.
Pattern details:
P?C\[ - either a PC[ or a C[ literal char sequence
[^]\\]* - zero or more chars other than \ and ]
(?:\\.[^]\\]*+)* - zero or more sequences of:
    \\. - a literal \ and then any char (. also matches a newline here because of the s modifier)
    [^]\\]*+ - 0+ chars other than ] and \ (matched possessively, no backtracking into the pattern)
] - a literal ] symbol (note it does not have to be escaped outside the character class to denote a literal closing bracket)