
Remove all empty lines

I thought this wouldn't be hard to do: in Java, I want to remove all empty lines (including lines that contain only blanks and tabs) with String.replaceAll.

My regex looks like this:

s = s.replaceAll ("^[ |\t]*\n$", "");

But it doesn't work.

I looked around, but only found regexes for removing empty lines without blanks or tabs.

Try this:

String text = "line 1\n\nline 3\n\n\nline 5";
String adjusted = text.replaceAll("(?m)^[ \t]*\r?\n", "");
// ...

Note that the regex [ |\t] matches a space, a tab or a pipe char!

EDIT

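Btw, the regex (?m)^\s+$ also matches whitespace-only lines, but its match has to end just before a line terminator (or at the end of the input), so replacing it with "" leaves at least one line break behind instead of removing the whole line.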
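
To make the two points of this answer concrete, here is a minimal sketch. The class name BlankLineRegexDemo and its method names are made up for illustration; only String.replaceAll and the patterns themselves come from the thread.

```java
final class BlankLineRegexDemo {

    // Pattern from the question: without (?m), ^ and $ anchor to the whole input,
    // so it can only match when the entire string is one blank line.
    static String original(String s) {
        return s.replaceAll("^[ |\t]*\n$", "");
    }

    // Suggested pattern: (?m) lets ^ match at the start of every line.
    static String fixed(String s) {
        return s.replaceAll("(?m)^[ \t]*\r?\n", "");
    }

    // Same as fixed(), but with the stray | kept inside the character class.
    static String withPipe(String s) {
        return s.replaceAll("(?m)^[ |\t]*\r?\n", "");
    }

    public static void main(String[] args) {
        String text = "line 1\n\nline 3\n \t\nline 5";
        System.out.println(original(text).equals(text)); // nothing was removed
        System.out.println(fixed(text));                 // only the non-blank lines remain
        System.out.println(withPipe("a\n||\nb"));        // the "||" line is removed as well
    }
}
```

The last call shows why the | matters: inside a character class it is a literal character, so a line consisting only of pipes would be treated as blank.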

I don't know the syntax for regular expressions in Java, but /^\s*$[\n\r]{1,}/gm is the regex you're looking for.

You probably write it like this in Java:

s = s.replaceAll("(?m)^\\s*$[\n\r]{1,}", "");

I tested it with JavaScript and it works fine.

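You can remove empty lines from your string using the following code:

String test = plainTextWithEmptyLines.replaceAll("[\\r\\n]+", "\n");

Here, plainTextWithEmptyLines denotes the string that contains the empty lines. [\r\n]+ is the regex pattern that matches a run of consecutive line breaks. Replacing each run with a single "\n" removes the empty lines in between, whereas replacing it with "" would also join the remaining lines together.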
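
A short sketch of how the replacement string changes the outcome of this approach. The class and method names are hypothetical; the point is only the difference between "" and "\n" as the second argument.

```java
final class LineBreakCollapseDemo {

    // Replacing every run of line breaks with "" also deletes the breaks
    // between non-empty lines, so the text ends up on a single line.
    static String dropAll(String s) {
        return s.replaceAll("[\\r\\n]+", "");
    }

    // Replacing every run with one "\n" keeps exactly one break per run.
    static String collapse(String s) {
        return s.replaceAll("[\\r\\n]+", "\n");
    }

    public static void main(String[] args) {
        String text = "a\r\n\r\nb";
        System.out.println(dropAll(text));  // both letters on one line
        System.out.println(collapse(text)); // one letter per line
    }
}
```

Neither version treats a line that contains only spaces or tabs as empty, because such a line interrupts the run of line breaks.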

I'm not a day-to-day Java programmer, so I'm surprised there isn't a simpler way to do this in the JDK than a regex.

Anyway,

s = s.replaceAll("\n+", "\n");

would be a bit simpler.

Update:

Sorry I missed that you wanted to also remove spaces and tabs.

s = s.replaceAll("\n[ \t]*\n", "\n");

Would work if you have consistent newlines. If not, you may want to consider making them consistent. Eg:

s = s.replaceAll("[\n\r]+", "\n");
s = s.replaceAll("\n[ \t]*\n", "\n");
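
One thing to watch with "\n[ \t]*\n": each match consumes the newline that the next whitespace-only line would need as its own starting point, so two such lines in a row are not both removed in one pass. The sketch below shows this and a possible variant that repeats the blank-line group. The variant and all names here are assumptions for illustration, not part of the answer above.

```java
final class ConsecutiveBlankLinesDemo {

    // Pattern from the answer: removes one whitespace-only line per match.
    static String singlePass(String s) {
        return s.replaceAll("\n[ \t]*\n", "\n");
    }

    // Hypothetical variant: one newline followed by one or more
    // whitespace-only lines is replaced by a single newline.
    static String grouped(String s) {
        return s.replaceAll("\n(?:[ \t]*\n)+", "\n");
    }

    public static void main(String[] args) {
        String text = "a\n \n \nb";
        System.out.println(singlePass(text)); // one whitespace-only line is left over
        System.out.println(grouped(text));    // both whitespace-only lines are gone
    }
}
```

Like the original, the variant needs a preceding newline, so it does not touch a blank first line.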

Bart Kiers's answer misses the edge case where the last line of the string is empty or contains only whitespace.

If you try

String text = "line 1\n\nline 3\n\n\nline 5\n "; // <-- Mind the \n plus space at the end!
String adjusted = text.replaceAll("(?m)^[ \t]*\r?\n", "");

you'll get a String that equals this

"line 1\nline 3\nline 5\n " // <-- MIND the \n plus space at the end!

as result.

I expanded Bart Kiers's answer to also cover this case.

My regex pattern is:

String pattern = "(?m)^\\s*\\r?\\n|\\r?\\n\\s*(?!.*\\r?\\n)";

A little explanation:

The first part of the pattern is basically the same as Bart Kiers's. It is fine, but it does not remove an "empty" last line or a last line containing only whitespace.

That is because a last line containing only whitespace does not end with \r?\n and would therefore not be matched or replaced. We need something to express this edge case. That's where the second part (after the |) comes in.

It uses a regular-expression feature: negative lookahead. That's the (?!.*\r?\n) part of the pattern. (?! marks the beginning of the lookahead. You could read it as: match the expression before the lookahead only if it is not followed by what the lookahead describes. In our case: not any characters (zero or more) followed by an optional carriage return and a newline: .*\r?\n. The ) closes the lookahead. The lookahead itself is not part of the match.

If I execute the following code snippet:

String pattern = "(?m)^\\s*\\r?\\n|\\r?\\n\\s*(?!.*\\r?\\n)";
String replacement = "";
String inputString =
        "\n" +
        "Line  2 - above line is empty without spaces\n" +
        "Line  3 - next is empty without whitespaces\n" +
        "\n" +
        "Line  5 - next line is with whitespaces\n" +
        "        \n" +
        "Line  7 - next 2 lines are \"empty\". First one with whitespaces.\n" +
        "        \r\n" +
        "\n" +
        "Line 10 - 3 empty lines follow. The 2nd one with whitespaces in it. One whitespace at the end of this line " +
        "\n" +
        "          \n" +
        "\n";

String adjustedString = inputString.replaceAll(pattern, replacement);
System.out.println("inputString:");
System.out.println("+----");
System.out.println(inputString);
System.out.println("----+");
System.out.println("adjustedString:");
System.out.println("+----");
System.out.print(adjustedString); //MIND the "print" instead of "println"
System.out.println("|EOS"); //String to clearly mark the _E_nd _O_f the adjusted_S_tring
System.out.println("----+");

I get:

inputString:
+----

Line  2 - above line is empty without spaces
Line  3 - next is empty without whitespaces

Line  5 - next line is with whitespaces

Line  7 - next 2 lines are "empty". First one with whitespaces.


Line 10 - 3 empty lines follow. The 2nd one with whitespaces in it. One whitespace at the end of this line



----+
adjustedString:
+----
Line  2 - above line is empty without spaces
Line  3 - next is empty without whitespaces
Line  5 - next line is with whitespaces
Line  7 - next 2 lines are "empty". First one with whitespaces.
Line 10 - 3 empty lines follow. The 2nd one with whitespaces in it. One whitespace at the end of this line |EOS
----+
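
A caveat worth checking before using this pattern: the lookahead only asks whether the rest of the current line is followed by another newline. If the input ends with a non-blank line that has no trailing line break, the second alternative also matches the break in front of that last line. The sketch below compares the pattern with a variant anchored to the end of the input via \z. The variant and all names are assumptions for illustration, not part of the answer.

```java
final class TrailingBlankLineDemo {

    // Pattern from the answer above (negative lookahead for "no further newline").
    static final String WITH_LOOKAHEAD = "(?m)^\\s*\\r?\\n|\\r?\\n\\s*(?!.*\\r?\\n)";

    // Hypothetical variant: the second alternative only matches a line break
    // plus whitespace that runs up to the very end of the input (\z).
    static final String WITH_END_ANCHOR = "(?m)^\\s*\\r?\\n|\\r?\\n\\s*\\z";

    static String strip(String text, String pattern) {
        return text.replaceAll(pattern, "");
    }

    public static void main(String[] args) {
        String blankTail = "line 1\n\nline 3\n\n\nline 5\n "; // ends with a whitespace-only line
        String textTail = "line 1\n\nline 3\n\n\nline 5";     // ends with text, no final newline

        System.out.println(strip(blankTail, WITH_LOOKAHEAD));  // three lines, no trailing blank
        System.out.println(strip(textTail, WITH_LOOKAHEAD));   // "line 3" and "line 5" get joined
        System.out.println(strip(textTail, WITH_END_ANCHOR));  // three lines, nothing joined
        System.out.println(strip(blankTail, WITH_END_ANCHOR)); // three lines, no trailing blank
    }
}
```

For inputs that always end in a line break or a blank line, as in the test string above, both patterns give the same result.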

If you want to learn more about lookahead and lookbehind, see the Regex Tutorial page "Lookahead and Lookbehind Zero-Length Assertions".

If you want to remove the lines in Microsoft Office, on Windows, or in a text editor that supports regular expressions:

 1. Press Ctrl + F.
 2. Check the regular expression checkbox.
 3. Enter the expression ^\s*\n into the find box exactly as written.

Once the matches are replaced with nothing, all the blank lines in your editor disappear.

Here is some code that does not use a regex. Besides java.io and java.util.Scanner, it only needs import org.apache.commons.lang3.StringUtils;

// The method header was missing from the original snippet;
// the name removeBlankLines is an assumption. "fichier" is the file to clean.
static void removeBlankLines(File fichier) {
  File temporaire = new File("temp.txt");
  try {
    Scanner scanner = new Scanner(fichier);
    BufferedWriter bw = new BufferedWriter(new FileWriter(temporaire));
    while (scanner.hasNextLine()) {
      String line = StringUtils.stripEnd(scanner.nextLine(), null); // Strip whitespace at the end of the line
      if (StringUtils.isNotBlank(line)) {
        bw.write(line); // Keep the line only if it is not blank
        if (scanner.hasNextLine()) {
          // Go to the next line (Win, Mac, Unix) if there is one
          bw.write(System.getProperty("line.separator"));
        }
      }
    }
    scanner.close();
    bw.close();
    // Replace the original file with the cleaned copy
    fichier.delete();
    temporaire.renameTo(fichier);
  }
  catch (FileNotFoundException e) {
    System.out.println(e.getMessage());
  }
  catch (IOException e) {
    System.out.println(e.getMessage());
  }
}

This method removes only empty lines (not whitespace-only ones), in plain Java:

private String removeEmptyLines(String text) {
    final String[] strings = text.split("\n");
    StringBuilder result = new StringBuilder();
    for (int i = 0, stringsLength = strings.length; i < stringsLength; i++) {
        String str = strings[i];
        if (str.isEmpty()) continue;
        result.append(str);
        if (i + 1 == stringsLength) continue;
        result.append("\n");
    }
    return result.toString();
}
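
For comparison, a sketch of a non-regex version that also drops whitespace-only lines and copes with \r\n line endings. It assumes Java 11 or later, where String.lines() and String.isBlank() are available; the class and method names are made up for illustration.

```java
import java.util.stream.Collectors;

final class BlankLineFilter {

    // String.lines() splits on \n, \r and \r\n; String.isBlank() is true for
    // empty strings and for strings that contain only whitespace.
    static String removeBlankLines(String text) {
        return text.lines()
                   .filter(line -> !line.isBlank())
                   .collect(Collectors.joining("\n"));
    }

    public static void main(String[] args) {
        System.out.println(removeBlankLines("a\r\n\r\n \t\nb\n"));
    }
}
```

Note that the result always uses "\n" as the separator and has no trailing line break, which may or may not be what you want.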
