
What is the best way to remove multiple occurrences of a character in a string in Java?

I have a string like foo..txt and I want to convert it to foo.txt. There may also be more than two '.' characters in a row. What is the best way to accomplish this?

Edit: The '.' characters may not all occur together. The occurrences may also look like this:

foo.bar.txt = foo bar.txt
foo..bar.foo.txt = foo bar foo.txt

With replaceAll()! Like this:

string = string.replaceAll("\\.{2,}", ".");

Note that we had to escape the period, since it's a special character in regular expressions (and also escape the backslash, for Java's sake). Also note the {2,} , which means "match if it occurs two or more times".
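To make the behaviour concrete, here is a minimal sketch that wraps the call above in a helper. The class and method names (CollapseDots, collapseDots) are made up for illustration and are not from the original answer.

```java
class CollapseDots {
    // Replace every run of two or more literal periods with a single period.
    // "\\.{2,}" is the Java string literal for the regex \.{2,}
    static String collapseDots(String input) {
        return input.replaceAll("\\.{2,}", ".");
    }

    public static void main(String[] args) {
        System.out.println(collapseDots("foo..txt"));    // foo.txt
        System.out.println(collapseDots("foo.....txt")); // foo.txt
        System.out.println(collapseDots("foo.bar.txt")); // foo.bar.txt (single dots are left alone)
    }
}
```

Note that this only collapses adjacent periods; it does not touch periods that are separated by other characters.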

I believe what you want is to replace all periods in the file name part with spaces, but keep the extension, right?

If so, something like this would be appropriate:

    String[] tests = {
        "foo.bar.txt",       // [foo bar.txt]
        "foo...bar.foo.txt", // [foo bar foo.txt]
        "........",          // [.]
        "...x...dat",        // [x.dat]
        "foo..txt",          // [foo.txt]
        "mmm....yummy...txt" // [mmm yummy.txt]
    };
    for (String test : tests) {
        int k = test.lastIndexOf('.');          
        String s = test.substring(0, k).replaceAll("\\.+", " ").trim()
           + test.substring(k);
        System.out.println("[" + s + "]");
    }

Essentially the way this works is:

  • First, find the lastIndexOf('.') in our string
    • Say this index is k , then we have logically separated our string into:
      • substring(0, k) , the prefix part
      • substring(k) , the suffix (file extension) part
  • Then we use regex on the prefix part to replaceAll matches of \.+ with " "
    • That is, a literal dot \. , repeated one or more times +
    • We also trim() this string to remove leading and trailing spaces
  • The result we want is the transformed prefix concatenated with the original suffix
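The steps above can also be packaged as a reusable method. This is only a sketch: the names FileNameDots and dotsToSpaces are hypothetical, and returning the input unchanged when it contains no '.' is my own assumption (without that check, lastIndexOf returns -1 and substring(0, -1) throws).

```java
class FileNameDots {
    // Replace runs of dots in the name part with a single space,
    // keeping the last dot and whatever follows it as the extension.
    static String dotsToSpaces(String name) {
        int k = name.lastIndexOf('.');
        if (k < 0) {
            // Assumption: with no dot at all there is nothing to transform.
            return name;
        }
        String prefix = name.substring(0, k).replaceAll("\\.+", " ").trim();
        String suffix = name.substring(k); // includes the final dot
        return prefix + suffix;
    }

    public static void main(String[] args) {
        System.out.println(dotsToSpaces("foo...bar.foo.txt")); // foo bar foo.txt
        System.out.println(dotsToSpaces("foo..txt"));          // foo.txt
        System.out.println(dotsToSpaces("README"));            // README
    }
}
```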

Clarifications

  • The pattern is \.+ rather than .+ because the dot . is a regex metacharacter; here we really mean a literal period, so it needs to be escaped as \.
  • As a Java string literal this pattern is written "\\.+" because \ is itself the escape character in Java string literals. For example, the string literal "\t" contains the tab character. Analogously, the string literal "\\" contains a single backslash character; it has a length() of one.
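If the two levels of escaping are hard to keep apart, a quick experiment may help. The sketch below uses made-up names (EscapeDemo, replaceAnyRun, replaceDotRun) and contrasts the unescaped and escaped patterns.

```java
class EscapeDemo {
    // Unescaped: the regex .+ matches any run of characters.
    static String replaceAnyRun(String s) {
        return s.replaceAll(".+", "X");
    }

    // Escaped: the regex \.+ matches only runs of literal periods.
    static String replaceDotRun(String s) {
        return s.replaceAll("\\.+", "X");
    }

    public static void main(String[] args) {
        System.out.println("\\".length());       // 1: a single backslash
        System.out.println("\t".length());       // 1: a single tab
        System.out.println("\\.+".length());     // 3: backslash, dot, plus
        System.out.println(replaceAnyRun("a..b")); // X
        System.out.println(replaceDotRun("a..b")); // aXb
    }
}
```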

You've made me read the manuals :) I solved a more general problem: how to replace any run of two or more identical consecutive characters with a single instance of that character:

String str = "assddffffadfdd..o";
System.out.println (str.replaceAll("(.)\\1+", "$1"));

Output:

asdfadfd.o
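In the pattern, (.) captures one character, \\1+ requires that same character to repeat one or more times, and $1 puts a single copy back. As a hedged sketch, the same idea can be narrowed to one chosen character. The names Squeeze, squeezeAll and squeezeChar are hypothetical; Pattern.quote and Matcher.quoteReplacement are standard java.util.regex methods, used here so that characters such as '.' or '$' are treated literally.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Squeeze {
    // Collapse every run of identical consecutive characters.
    static String squeezeAll(String s) {
        return s.replaceAll("(.)\\1+", "$1");
    }

    // Collapse runs of one specific character only.
    static String squeezeChar(String s, char c) {
        String literal = String.valueOf(c);
        // Quote the character so regex metacharacters are matched literally.
        String regex = "(?:" + Pattern.quote(literal) + "){2,}";
        return s.replaceAll(regex, Matcher.quoteReplacement(literal));
    }

    public static void main(String[] args) {
        System.out.println(squeezeAll("assddffffadfdd..o"));  // asdfadfd.o
        System.out.println(squeezeChar("file....txt", '.'));  // file.txt
        System.out.println(squeezeChar("aa..bb", '.'));       // aa.bb
    }
}
```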

If you only need a solution for the "filename....ext" case, then I'd prefer something simpler, like in Etaoin's answer, because it is probably faster (though I haven't measured it). My solution, simplified for this specific case, looks like this:

str.replaceAll("(\\.)\\1+", "$1")

"file....txt".replaceAll("\\.\\.+",".")

The regular expression matches every run of more than one dot and replaces it with a single dot.

Use replaceAll() like this:

string = string.replaceAll("\\.+(?=.*\\..*)", " ");

Reading the regex from left to right:

  • "\\.+" matches one or more periods
  • "(?=.*\\..*)" is a lookahead that only lets the match succeed if another period appears later in the string

It handles the test case you mentioned - however, cases like:

  • txt.
  • test..txt
  • .test.txt

are converted as follows:

  • txt.
  • test .txt
  •  test.txt (with a leading space)
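A small harness makes these edge cases easy to check. This is just a sketch with made-up names (LookaheadDemo, spaceOutDots); the brackets in the output are there so leading and trailing spaces are visible.

```java
class LookaheadDemo {
    // Replace each run of periods with a space, but only when
    // another period still follows somewhere later in the string.
    static String spaceOutDots(String s) {
        return s.replaceAll("\\.+(?=.*\\..*)", " ");
    }

    public static void main(String[] args) {
        String[] tests = { "foo.bar.txt", "foo..bar.foo.txt", "txt.", "test..txt", ".test.txt" };
        for (String test : tests) {
            System.out.println("[" + spaceOutDots(test) + "]");
        }
        // Prints:
        // [foo bar.txt]
        // [foo bar foo.txt]
        // [txt.]
        // [test .txt]
        // [ test.txt]
    }
}
```

In test..txt the two adjacent dots are not collapsed: the first dot is replaced because the second one still follows it, and the second dot is kept as the extension separator.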

I suggest String.replaceAll
