
Replace a block of text using a regex pattern (in Java)

I'm trying to remove blocks of text from a file using regular expressions. I have the content of the file in a single String, but the Matcher does not match the pattern the way I expect. The example file is:

\begin{comment}
this block should be removed
i.e. it need to be replaced
\end{comment}
this block should remains.
\begin{comment}
this should be removed too.
\end{comment}

I need to find the blocks starting with \begin{comment} and ending with \end{comment}, and then remove them. The regex that I'm using is \\begin\{.*?\\end\{comment\} which should match anything starting with '\begin' up to the first occurrence of '\end{comment}'. It worked in Notepad++.

However, with the Java code below, it finds the first '\begin' line and the last '\end' line and removes everything in between. I want to keep the lines that are not inside the blocks. This is the minimal code that I used:

import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class main {
    public static void main(String[] args) {
        String output;
        String s =  "\\begin{comment}\n"+
        "this block should be removed\n"+
        "i.e. it need to be replaced\n"+
        "\\end{comment}\n"+
        "this block should remains.\n"+
        "\\begin{comment}\n"+
        "this should be removed too.\n"+
        "\\end{comment}";
        Matcher m = Pattern.compile("\\\\begin\\{comment(?s).*\\\\end\\{comm.*?\\}").matcher(s);
        while(m.find())
        {
            System.out.println(m.group(0));
            output = m.replaceAll("");
        }

        m = Pattern.compile("\\begin").matcher(s);
        while(m.find())
        {
            System.out.println(m.group(0));
            output = m.replaceAll("");
        }
    }
}

Update:

I used this online tool to find it:

    Matcher m = Pattern.compile("\\\\begin\\{comment(?s).*\\\\end\\{comm.*?\\}").matcher(s);

You have to fix two things in your code:

  1. The pattern should be consistent with your Notepad++ equivalent: the star should be followed by ? to make it lazy:

     Matcher m = Pattern.compile("\\\\begin\\{comment}(?s).*?\\\\end\\{comment}").matcher(s);
     -------------------------------------------------------^

Note that this pattern works correctly only if there are no nested comment sections.

  2. The second fix concerns the logic: calling the matcher's replaceAll method replaces every matching section in a single call (already on the first iteration of the m.find() loop). If you need the loop to inspect every comment block, replace it with:

     output = m.replaceFirst("");

    or simply call output = m.replaceAll(""); without any loop at all. Note that both methods reset the matcher before doing the replacement, so the loop-free call is the more robust option.
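For reference, the two fixes above can be put together roughly as follows. This is only a sketch under a few assumptions: the class name CommentStripper and both method names are made up for illustration, comment blocks are assumed not to be nested, and the trailing \n? in the pattern is an extra that is not in the original code (it swallows the line break after each block so no empty line is left behind). The second method shows one way to still look at every block while removing it, using Matcher.appendReplacement and appendTail instead of calling replaceFirst inside the find() loop.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative only.
final class CommentStripper {
    // Lazy quantifier (.*?) so each match stops at the nearest \end{comment}.
    // DOTALL lets '.' match line breaks. The optional \n is an added assumption:
    // it also removes the line break that follows a block.
    private static final Pattern COMMENT_BLOCK = Pattern.compile(
            "\\\\begin\\{comment\\}.*?\\\\end\\{comment\\}\\n?", Pattern.DOTALL);

    // Removes every comment block in a single call; no find() loop is needed.
    static String stripComments(String text) {
        return COMMENT_BLOCK.matcher(text).replaceAll("");
    }

    // Same result, but records each removed block so it can be inspected.
    static String stripComments(String text, List<String> removed) {
        Matcher m = COMMENT_BLOCK.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            removed.add(m.group());       // the block that is about to be dropped
            m.appendReplacement(out, ""); // copy the text before it, skip the block
        }
        m.appendTail(out);                // copy whatever follows the last block
        return out.toString();
    }

    public static void main(String[] args) {
        String s = "\\begin{comment}\n"
                + "this block should be removed\n"
                + "i.e. it need to be replaced\n"
                + "\\end{comment}\n"
                + "this block should remains.\n"
                + "\\begin{comment}\n"
                + "this should be removed too.\n"
                + "\\end{comment}";

        List<String> removed = new ArrayList<>();
        System.out.print(stripComments(s, removed)); // prints the one surviving line
        System.out.println("removed blocks: " + removed.size());
    }
}
```

With the sample input from the question, only the line "this block should remains." is left and two blocks are reported as removed. If the blocks can be nested, a regex like this is not enough and a small parser that counts \begin and \end would be needed instead.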
