Java: String.replaceAll(regex, replacement);

I have a string of comma-separated user IDs and I want to remove a specific user ID from it.

These are the possible forms of the string and the expected result:

int elimiateUserId = 11;

String css1 = "11,22,33,44,55";
String css2 = "22,33,11,44,55";
String css3 = "22,33,44,55,11";
// The expected result in all cases, after replacement, should be:
// "22,33,44,55"

I tried the following:

String result = css#.replaceAll("," + elimiateUserId, "");  // # =  1 or 2 or 3
result = css#.replaceAll(elimiateUserId + "," , "");

This logic fails for css3. Please suggest a proper solution for this issue.

Note: I'm working with Java 7.

I checked around the following posts, but could not find any solution:

You can use the Stream API in Java 8:

int elimiateUserId = 11;
String css1 = "11,22,33,44,55";

String css1Result = Stream.of(css1.split(","))
    .filter(value -> !String.valueOf(elimiateUserId).equals(value))
    .collect(Collectors.joining(","));

// css1Result = 22,33,44,55
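Since the question targets Java 7, where streams and lambdas are not available, the same split-and-filter idea can be sketched with a plain loop. This is only a minimal sketch: the class name UserIdRemover and the method removeId are made up for illustration and do not come from the answer above.

```java
// Hypothetical helper (name is an assumption): Java 7 version of the
// split / filter / join idea from the Stream answer.
final class UserIdRemover {

    /** Returns csv without any entry equal to idToRemove; the order of the rest is kept. */
    static String removeId(String csv, int idToRemove) {
        String target = String.valueOf(idToRemove);
        StringBuilder kept = new StringBuilder();
        for (String id : csv.split(",")) {
            if (id.equals(target)) {
                continue; // drop the id being eliminated
            }
            if (kept.length() > 0) {
                kept.append(','); // separator only between kept ids
            }
            kept.append(id);
        }
        return kept.toString();
    }

    public static void main(String[] args) {
        System.out.println(removeId("11,22,33,44,55", 11));
        System.out.println(removeId("22,33,11,44,55", 11));
        System.out.println(removeId("22,33,44,55,11", 11));
    }
}
```

Because whole entries are compared with equals, an id such as 111 is never touched when 11 is removed.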

If you want to use a regex, you may use the following (remember to escape the backslashes when writing it as a Java string literal):

,\b11\b|\b11\b,

This will ensure that 11 won't be matched as part of another number due to the word boundaries and only one comma (if two are present) is matched and removed.
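Written as a Java string literal, each \b becomes \\b. The following is a small sketch of how that pattern might be assembled for an arbitrary id; the class name BoundaryRegexDemo and the method removeId are assumptions made for this example.

```java
// Hypothetical wrapper (names are assumptions) around the pattern
// ,\b11\b|\b11\b, with the backslashes escaped for a Java string literal.
final class BoundaryRegexDemo {

    static String removeId(String csv, int id) {
        // ",\\bID\\b" removes the id together with its leading comma,
        // "\\bID\\b," removes it together with its trailing comma.
        String regex = ",\\b" + id + "\\b|\\b" + id + "\\b,";
        return csv.replaceAll(regex, "");
    }

    public static void main(String[] args) {
        System.out.println(removeId("11,22,33,44,55", 11));
        System.out.println(removeId("22,33,11,44,55", 11));
        System.out.println(removeId("22,33,44,55,11", 11));
        System.out.println(removeId("111,11,211", 11));
    }
}
```

Note that the pattern always needs a comma on one side, so a string consisting of the single id 11 is left unchanged.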

You may build a regex like

^11,|,11\b

that will match 11, at the start of a string (^11,) or (|) ,11 not followed by any other word character (,11\b).

See the regex demo.

int elimiate_user_id = 11;
String pattern = "^" + elimiate_user_id + ",|," + elimiate_user_id + "\\b";
System.out.println("11,22,33,44,55,111".replaceAll(pattern, "")); // => 22,33,44,55,111
System.out.println("22,33,11,44,55,111".replaceAll(pattern, "")); // => 22,33,44,55,111 
System.out.println("22,33,44,55,111,11".replaceAll(pattern, "")); // => 22,33,44,55,111

See the Java demo

Try passing the expression (^(11)(?:,))|((?<=,)(11)(?:,))|(,11$) to replaceAll:

// String.valueOf avoids MessageFormat's locale grouping of numbers (e.g. 1234 -> "1,234")
final String regexp = MessageFormat.format("(^({0})(?:,))|((?<=,)({0})(?:,))|(,{0}$)", String.valueOf(elimiateUserId));
String result = css#.replaceAll(regexp, ""); // for all cases

Here is an example: https://regex101.com/r/LwJgRu/3

You can chain two replace calls in one statement:

int elimiateUserId = 11;
String result = css#.replace("," + elimiateUserId , "").replace(elimiateUserId + ",", "");

If the string contains ,11 the first replace removes it.
If the string contains 11, the second replace removes it.

result

11,22,33,44,55      ->     22,33,44,55
22,33,11,44,55      ->     22,33,44,55
22,33,44,55,11      ->     22,33,44,55

ideone demo
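One caveat worth checking: plain replace works on substrings, so it also fires when 11 is only part of a longer id. The sketch below is not part of the answer above; the class name ChainedReplaceCheck and its method are made up to show that behaviour.

```java
// Hypothetical check (names are assumptions) of the chained replace
// when another id merely contains the digits 11.
final class ChainedReplaceCheck {

    static String chained(String csv, int id) {
        // Same two literal replacements as in the answer above.
        return csv.replace("," + id, "").replace(id + ",", "");
    }

    public static void main(String[] args) {
        // The three cases from the question behave as expected.
        System.out.println(chained("22,33,11,44,55", 11));
        // Here ",11" is found inside ",111", so 55 and the leftover 1 merge.
        System.out.println(chained("22,33,44,55,111", 11));
        // Here "11," is found inside "211,", so 2 and 22 merge.
        System.out.println(chained("211,22", 11));
    }
}
```

If such ids can occur, one of the word-boundary patterns or a split-based approach avoids the problem.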

try this:

String result = css#.replaceAll("," + elimiateUserId, "")
             .replaceAll(elimiateUserId + "," , "");
String result = css#.replaceAll("," + eliminate_user_id + "\\b|\\b" + eliminate_user_id + ",", "");

The regular expression here is:

,     A leading comma.
eliminate_user_id  I assumed the missing 'n' here was a typo.
\b    Word boundary: word/number characters end here.
|     OR
\b    Word boundary: word/number characters begin here.
eliminate_user_id again.
,     A trailing comma.

The word boundary marker, matching the beginning or end of a "word", is the magic here. It means that the 11 will match in these strings:

11,22,33,44,55
22,33,11,44,55
22,33,44,55,11 

But not these strings:

111,112,113,114
411,311,211,111

There's a cleaner way, though:

String result = css#.replaceAll("(,?)\\b" + eliminate_user_id + "\\b(?(1)|,)", "");

The regular expression here is:

(     A capturing group - what's in here, is in group 1.
,?    An optional leading comma.
)     End the capturing group.
\b    Word boundary: word/number characters begin here.
eliminate_user_id  I assumed the missing 'n' here was a typo.
\b    Word boundary: word/number characters end here.
(?(1) If there's something in group 1, then require...
|     ...nothing, but if there was nothing, then require...
,     A trailing comma.
)     end the if.

The "if" part here is a little unusual - you can find a little more information on regex conditionals here: http://www.regular-expressions.info/conditional.html

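Unfortunately, Java's java.util.regex does not support regex conditionals (see also "Conditional Regular Expression in Java?"), so this second pattern is rejected with a PatternSyntaxException there :( It only works in engines that do support conditionals, such as PCRE.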


Side-note: for performance, if the list is VERY long and there are VERY many removals to be performed, the most obvious option is to just run the above line for each number to be removed:

String css = "11,22,33,44,55,66,77,88,99,1010,1111,1212,...";
String[] removals = {"11", "33", "55", "77", "99", "1212"};
for (int i = 0; i < removals.length; i++) {
  // remove ",id" or "id," for the current id; \\b keeps longer ids intact
  css = css.replaceAll("," + removals[i] + "\\b|\\b" + removals[i] + ",", "");
}

(code not tested: don't have access to a Java compiler here)

This will be fast enough (worst case scales with about O(m*n) for m removals from a string of n ids), but we can maybe do better.

One is to build the regex to be \\b(11|42|18|13|123|...etc)\\b - that is, make the regex search for all ids to be removed at the same time. In theory this scales a little worse, scaling with O(m*n) in every case rather than just the worst case, but in practice it should be considerably faster.

String css = "11,22,33,44,55,66,77,88,99,1010,1111,1212,...";
String[] removals = {"11", "33", "55", "77", "99", "1212"};
String removalsStr = String.join("|", removals); // String.join needs Java 8+
// group the alternation so the comma and \\b apply to every id
css = css.replaceAll(",(?:" + removalsStr + ")\\b|\\b(?:" + removalsStr + "),", "");

But another approach might be to build a hashtable of the ids in the long string, then remove all the ids from the hashtable, then concatenate the remaining hashtable keys back into a string. Since hashtable lookups are effectively O(1) for sparse hashtables, that makes this scale with O(n). The tradeoff here is the extra memory for that hashtable, though.

(I don't think I can do this version without a java compiler handy. I would not recommend this approach unless you have a VAST (many thousands) list of IDs to remove, anyway, as it will be much uglier and more complex code).
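For what it is worth, a hash-based version does not have to be much uglier. The sketch below uses a slight variation of the idea above: instead of hashing the ids of the long string, it puts the ids to remove into a HashSet and filters the split string against it, which keeps the original order and still costs roughly one lookup per id. It is untested against real data, compiles on Java 7, and the names BulkIdRemover and removeAll are assumptions made for this example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper (names are assumptions): removes many ids in one pass
// by looking each entry up in a HashSet of ids to drop.
final class BulkIdRemover {

    static String removeAll(String csv, Set<String> removals) {
        StringBuilder kept = new StringBuilder();
        for (String id : csv.split(",")) {
            if (removals.contains(id)) {
                continue; // expected O(1) lookup per id
            }
            if (kept.length() > 0) {
                kept.append(',');
            }
            kept.append(id);
        }
        return kept.toString();
    }

    public static void main(String[] args) {
        Set<String> removals = new HashSet<String>(Arrays.asList("11", "33", "55"));
        System.out.println(removeAll("11,22,33,44,55,66", removals));
    }
}
```

The extra memory is one set entry per id to remove, rather than one per id in the long string.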

I think it's safer to maintain a whitelist and then use it as a reference when making further changes.

List<String> whitelist = Arrays.asList("22", "33", "44", "55");
String s = "22,33,44,55,11";
String[] sArr = s.split(",");
StringBuilder ids = new StringBuilder();
for (String id : sArr) {
    if (whitelist.contains(id)) {
        // add the separator only between kept ids, so no trailing cleanup is needed
        if (ids.length() > 0) {
            ids.append(",");
        }
        ids.append(id);
    }
}
String r = ids.toString();
System.out.println(r);

If you need a solution with regex, the following works.

    int elimiate_user_id = 11;

    String css1 = "11,22,33,44,55";
    String css2 = "22,33,11,44,55";
    String css3 = "22,33,44,55,11";

    // \\b keeps ids such as 111 or 211 intact; the second call drops a trailing comma
    String resultCss = css1.replaceAll("\\b" + elimiate_user_id + "\\b,?", "").replaceAll(",$", "");

It works with all three sample inputs.

This should work

replaceAll("(11,|,11)", "")

At least as long as you can guarantee that the string contains no ids such as 311 or 113.
