
Replace all occurrences of a word in a string using regex

Is there no easy way to replace all occurrences of a (whole) word in a string? I am using this currently and it is not very elegant:

public static String replace(String input, String toReplace, 
                           String replacement){
    if(input==null) throw new NullPointerException();
    input = input.replace(" "+toReplace+" ", " "+replacement+" ");
    input = input.replaceAll("^"+toReplace+" ", replacement+" ");
    input = input.replaceAll(" "+toReplace+"$", " "+replacement);
    return input;
}

Also, the regular expression "^"+toReplace+" " is not regex-safe: toReplace might contain a metacharacter such as [ or (.

Edit:

Any reasons why this code:

public static String replace(String input, String toReplace, 
                           String replacement){
    if(input==null) throw new NullPointerException();
    input = input.replace(" "+toReplace+" ", " "+replacement+" ");
    input = input.replaceAll(Pattern.quote("^"+toReplace+" "), replacement+" ");
    input = input.replaceAll(Pattern.quote(" "+toReplace+"$"), " "+replacement);
    //input = input.replaceAll("\\b" + Pattern.quote(toReplace) + "\\b", replacement);
    return input;
}

behaves this way when:

    input = "test a testtest te[(st string test";
    input = replace(input, toReplace, "REP");
    System.out.println(input);

a) toReplace = test prints:

test a testtest te[(st string test

b) toReplace = te[(st prints:

test a testtest REP string test

Thanks,

Use word boundaries (\\b) and Pattern.quote to escape the search string.

return input.replaceAll("\\b" + Pattern.quote(toReplace) + "\\b", replacement);

\\b matches a zero-width boundary between a word character and a non-word character. The very start and very end of the string also count as boundaries when the adjacent character is a word character.
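To see where \\b actually matches, the following sketch lists the start index of every whole-word hit in the question's sample input. The class name WordBoundaryDemo and the helper matchStarts are made up for illustration; only Pattern, Matcher and Pattern.quote come from java.util.regex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WordBoundaryDemo {
    // Hypothetical helper: start indices of whole-word matches of a literal word.
    static List<Integer> matchStarts(String input, String word) {
        Matcher m = Pattern.compile("\\b" + Pattern.quote(word) + "\\b").matcher(input);
        List<Integer> starts = new ArrayList<>();
        while (m.find()) {
            starts.add(m.start());
        }
        return starts;
    }

    public static void main(String[] args) {
        String input = "test a testtest te[(st string test";
        // Only the standalone words match; "testtest" has no boundary in the middle.
        System.out.println(matchStarts(input, "test"));   // [0, 30]
        System.out.println(matchStarts(input, "te[(st")); // [16]
        // No boundary exists between '+' and ' ', so the trailing \b fails.
        System.out.println(matchStarts("I like c++ a lot", "c++")); // []
    }
}
```

One caveat this makes visible: \\b only exists next to a letter, digit or underscore, so a search term that starts or ends with punctuation (c++ here) is never matched by this pattern.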

There is a special regexp code for a word boundary: \\b. It covers your manual handling of spaces and of the beginning and end of the string, as well as other cases such as punctuation.

The method Pattern.quote() quotes a string so that any regexp special characters inside it are treated literally. As you have suggested, it should always be used if the string is arbitrary or might be user-supplied.
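A small sketch of what Pattern.quote changes, using the question's te[(st as the search string. It also suggests why quoting the anchor together with the word, as in Pattern.quote("^"+toReplace+" "), stops that replacement from firing: the ^ becomes a literal character. The class name QuoteDemo and the helper compiles are illustrative only.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class QuoteDemo {
    // Hypothetical helper: true if the string is a syntactically valid regex.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String word = "te[(st";
        System.out.println(compiles(word));                // false: '[' opens an unclosed character class
        System.out.println(Pattern.quote(word));           // \Qte[(st\E
        System.out.println(compiles(Pattern.quote(word))); // true

        // Quoting the anchor too turns ^ into a literal character, so nothing matches.
        System.out.println("test a".replaceAll(Pattern.quote("^test "), "REP ")); // test a
        // Quote only the word and keep the anchor outside the quoted part.
        System.out.println("test a".replaceAll("^" + Pattern.quote("test") + " ", "REP ")); // REP a
    }
}
```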

So that gives:

input = input.replaceAll("\\b"+Pattern.quote(toReplace)+"\\b", replacement);

\\b matches word boundaries, see http://www.regular-expressions.info/wordboundaries.html

Use java.util.regex.Pattern.quote to escape special characters.

You need to know about the regex \\b, which is a zero-width match of a "word boundary". With it, the guts of your method become simply one line:

return input.replaceAll("\\b"+Pattern.quote(toReplace)+"\\b", replacement);
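Put together, a complete version of the method might look like the sketch below. The null checks via Objects.requireNonNull and the call to Matcher.quoteReplacement are additions that none of the answers mention: replaceAll gives $ and \ a special meaning in the replacement string, so a replacement such as $5 would otherwise be read as a group reference. The class name WordReplacer is made up.

```java
import java.util.Objects;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WordReplacer {
    // Replaces every whole-word occurrence of a literal word.
    public static String replace(String input, String toReplace, String replacement) {
        Objects.requireNonNull(input, "input");
        Objects.requireNonNull(toReplace, "toReplace");
        Objects.requireNonNull(replacement, "replacement");
        // Quote only the word; the \b anchors must stay outside the quoted part.
        String regex = "\\b" + Pattern.quote(toReplace) + "\\b";
        // Assumption: the replacement is meant literally, so escape $ and \ in it.
        return input.replaceAll(regex, Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        String input = "test a testtest te[(st string test";
        System.out.println(replace(input, "test", "REP"));       // REP a testtest te[(st string REP
        System.out.println(replace(input, "te[(st", "REP"));     // test a testtest REP string test
        System.out.println(replace("price test", "test", "$5")); // price $5
    }
}
```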


 