
Java: String.replaceAll(regex, replacement);

I have a string of comma-separated user IDs and I want to remove a specific user ID from it.

I have the following possible strings and the expected result:

int elimiateUserId = 11;

String css1 = "11,22,33,44,55";
String css2 = "22,33,11,44,55";
String css3 = "22,33,44,55,11";
// The expected result in all cases, after replacement, should be:
// "22,33,44,55"

I tried the following:

String result = css#.replaceAll("," + elimiateUserId, "");  // # =  1 or 2 or 3
result = css#.replaceAll(elimiateUserId + "," , "");

This logic fails in the case of css3. Please suggest a proper solution for this issue.

Note: I'm working with Java 7.

I checked the following posts, but could not find any solution:

You can use the Stream API in Java 8:

int elimiateUserId = 11;
String css1 = "11,22,33,44,55";

String css1Result = Stream.of(css1.split(","))
    .filter(value -> !String.valueOf(elimiateUserId).equals(value))
    .collect(Collectors.joining(","));

// css1Result = 22,33,44,55
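
Since the question mentions Java 7, where streams are not available, the same split-and-filter idea can be sketched with a plain loop. The class and method names below (RemoveIdJava7, removeId) are made up for illustration and are not part of the original answer.

```java
class RemoveIdJava7 {
    // Hypothetical helper: splits on commas, skips every element equal to
    // idToRemove, and joins the rest back together with commas.
    static String removeId(String csv, int idToRemove) {
        String target = String.valueOf(idToRemove);
        StringBuilder sb = new StringBuilder();
        for (String value : csv.split(",")) {
            if (target.equals(value)) {
                continue; // drop the unwanted id
            }
            if (sb.length() > 0) {
                sb.append(','); // separator only between kept elements
            }
            sb.append(value);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(removeId("11,22,33,44,55", 11)); // 22,33,44,55
        System.out.println(removeId("22,33,11,44,55", 11)); // 22,33,44,55
        System.out.println(removeId("22,33,44,55,11", 11)); // 22,33,44,55
    }
}
```

Because whole elements are compared, an id such as 111 is left alone when 11 is removed.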

If you want to use regex, you may use the following (remember to escape it properly as a Java string literal):

,\b11\b|\b11\b,

This will ensure that 11 won't be matched as part of another number due to the word boundaries and only one comma (if two are present) is matched and removed.
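
As a rough sketch of what "properly escaped" looks like, the pattern can be written in Java with each backslash doubled. The wrapper class and method names are assumptions added for illustration.

```java
class WordBoundaryRemoval {
    // Hypothetical helper: builds ,\b<id>\b|\b<id>\b, as a Java string literal
    // (each regex backslash is written as \\) and removes every match.
    static String removeId(String csv, int id) {
        String regex = ",\\b" + id + "\\b|\\b" + id + "\\b,";
        return csv.replaceAll(regex, "");
    }

    public static void main(String[] args) {
        System.out.println(removeId("11,22,33,44,55", 11)); // 22,33,44,55
        System.out.println(removeId("22,33,11,44,55", 11)); // 22,33,44,55
        System.out.println(removeId("22,33,44,55,11", 11)); // 22,33,44,55
        System.out.println(removeId("111,22,211", 11));     // 111,22,211
    }
}
```

Note that a string consisting of just 11 contains no comma, so this pattern leaves it untouched.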

You may build a regex like

^11,|,11\b

that will match 11, at the start of a string (^11,) or (|) ,11 not followed by any other word char (,11\\b).

See the regex demo.

int elimiate_user_id = 11;
String pattern = "^" + elimiate_user_id + ",|," + elimiate_user_id + "\\b";
System.out.println("11,22,33,44,55,111".replaceAll(pattern, "")); // => 22,33,44,55,111
System.out.println("22,33,11,44,55,111".replaceAll(pattern, "")); // => 22,33,44,55,111 
System.out.println("22,33,44,55,111,11".replaceAll(pattern, "")); // => 22,33,44,55,111

See the Java demo.

Try the expression (^(11)(?:,))|((?<=,)(11)(?:,))|(,11$) with replaceAll:

// Pass the id as a String: MessageFormat would otherwise format a large int with grouping separators (e.g. 1,234).
final String regexp = MessageFormat.format("(^({0})(?:,))|((?<=,)({0})(?:,))|(,{0}$)", String.valueOf(elimiateUserId));
String result = css#.replaceAll(regexp, ""); // for all cases

Here is an example: https://regex101.com/r/LwJgRu/3

You can chain two replace calls, like this:

int elimiateUserId = 11;
String result = css#.replace("," + elimiateUserId , "").replace(elimiateUserId + ",", "");

If your string contains ,11 the first replace will replace it with an empty string.
If your string contains 11, the second replace will replace it with an empty string.

Result:

11,22,33,44,55      ->     22,33,44,55
22,33,11,44,55      ->     22,33,44,55
22,33,44,55,11      ->     22,33,44,55

ideone demo

Try this:

String result = css#.replaceAll("," + elimiateUserId, "")
             .replaceAll(elimiateUserId + "," , "");
String result = css#.replaceAll("," + eliminate_user_id + "\\b|\\b" + eliminate_user_id + ",", "");

The regular expression here is:

,     A leading comma.
eliminate_user_id  I assumed the missing 'n' here was a typo.
\b    Word boundary: word/number characters end here.
|     OR
\b    Word boundary: word/number characters begin here.
eliminate_user_id again.
,     A trailing comma.

The word boundary marker, matching the beginning or end of a "word", is the magic here. It means that the 11 will match in these strings:

11,22,33,44,55
22,33,11,44,55
22,33,44,55,11 

But not these strings:

111,112,113,114
411,311,211,111
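
One way to check these match/no-match claims is a small test with Matcher.find(). This is only a sketch; the class and method names are hypothetical, and Pattern.quote is added as a precaution in case the id ever contains regex metacharacters.

```java
import java.util.regex.Pattern;

class BoundaryMatchCheck {
    // Hypothetical helper: true if the pattern ,<id>\b|\b<id>, finds
    // the id as a whole list element next to a comma.
    static boolean containsId(String csv, String id) {
        String quoted = Pattern.quote(id);
        Pattern p = Pattern.compile("," + quoted + "\\b|\\b" + quoted + ",");
        return p.matcher(csv).find();
    }

    public static void main(String[] args) {
        System.out.println(containsId("22,33,11,44,55", "11"));  // true
        System.out.println(containsId("111,112,113,114", "11")); // false
        System.out.println(containsId("411,311,211,111", "11")); // false
    }
}
```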

There's a cleaner way, though:

String result = css#.replaceAll("(,)?\\b" + eliminate_user_id + "\\b(?(1)|,)", "");

The regular expression here is:

(     A capturing group - what's in here is in group 1.
,     A leading comma.
)?    End the capturing group; the ? sits outside it so that group 1 is only set when a comma was actually matched.
\b    Word boundary: word/number characters begin here.
eliminate_user_id  I assumed the missing 'n' here was a typo.
\b    Word boundary: word/number characters end here.
(?(1) If group 1 matched, then require...
|     ...nothing, but if it did not, then require...
,     A trailing comma.
)     End the if.

The "if" part here is a little unusual - you can find a little more information on regex conditionals here: http://www.regular-expressions.info/conditional.html

I am not sure if Java supports regex conditionals. Some posts here (Conditional Regular Expression in Java?) suggest that it does not :(
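
A quick way to settle this on a given JDK is to try compiling a conditional pattern and see whether it is rejected. The sketch below does that and, as a fallback, uses a plain alternation that needs no conditional. The class and method names are hypothetical. As far as I can tell java.util.regex rejects the (?(1)...) construct, but that is worth confirming on your own runtime.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class ConditionalSupportCheck {
    // Returns true if this JDK's regex engine accepts the given pattern.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    // Conditional-free equivalent: either ",id" or "id," with word boundaries.
    static String removeId(String csv, int id) {
        return csv.replaceAll(",\\b" + id + "\\b|\\b" + id + "\\b,", "");
    }

    public static void main(String[] args) {
        System.out.println(compiles("(,)?\\b11\\b(?(1)|,)"));
        System.out.println(removeId("22,33,11,44,55", 11));
    }
}
```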


Side-note: for performance, if the list is VERY long and there are VERY many removals to be performed, the most obvious option is to just run the above line for each number to be removed:

String css = "11,22,33,44,55,66,77,88,99,1010,1111,1212,...";
String[] removals = {"11", "33", "55", "77", "99", "1212"};
for (int i = 0; i < removals.length; i++) {
  // Both alternatives must use the current removal, with \\b escaped for Java.
  css = css.replaceAll("," + removals[i] + "\\b|\\b" + removals[i] + ",", "");
}

(code not tested: don't have access to a Java compiler here)

This will be fast enough (worst case scales with about O(m*n) for m removals from a string of n ids), but we can maybe do better.

One is to build the regex to be \\b(11|42|18|13|123|...etc)\\b - that is, make the regex search for all ids to be removed at the same time. In theory this scales a little worse, scaling with O(m*n) in every case rather than just the worst case, but in practice should be considerably faster.

String css = "11,22,33,44,55,66,77,88,99,1010,1111,1212,...";
String[] removals = {"11", "33", "55", "77", "99", "1212"};
// Group the alternatives so the comma and \\b apply to every id (String.join needs Java 8+).
String removalsStr = "(?:" + String.join("|", removals) + ")";
css = css.replaceAll("," + removalsStr + "\\b|\\b" + removalsStr + ",", "");

But another approach might be to build a hashtable of the ids in the long string, then remove all the ids from the hashtable, then concatenate the remaining hashtable keys back into a string. Since hashtable lookups are effectively O(1) for sparse hashtables, that makes this scale with O(n). The tradeoff here is the extra memory for that hashtable, though.

(I don't think I can do this version without a Java compiler handy. I would not recommend this approach unless you have a VAST (many thousands) list of IDs to remove, anyway, as it will be much uglier and more complex code.)
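
For what it is worth, the hashtable idea could look roughly like the sketch below. It is one possible reading of the description, not the answerer's code. It uses a LinkedHashSet (my choice, to keep the original order), the class and method names are made up, and it assumes duplicate ids in the input may be collapsed into one.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

class SetBasedRemoval {
    // Hypothetical helper: loads the ids into an insertion-ordered hash set,
    // removes each unwanted id in (expected) constant time, then re-joins.
    static String removeAll(String csv, String... removals) {
        Set<String> ids = new LinkedHashSet<String>(Arrays.asList(csv.split(",")));
        for (String removal : removals) {
            ids.remove(removal); // no-op if the id is not present
        }
        StringBuilder sb = new StringBuilder();
        for (String id : ids) {
            if (sb.length() > 0) {
                sb.append(',');
            }
            sb.append(id);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String css = "11,22,33,44,55,66,77,88,99,1010,1111,1212";
        System.out.println(removeAll(css, "11", "33", "55", "77", "99", "1212"));
        // 22,44,66,88,1010,1111
    }
}
```

This avoids regex entirely, so ids like 1111 are never affected by removing 11.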

I think it's safer to maintain a whitelist and then use it as a reference to make further changes.

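List<String> whitelist = Arrays.asList("22", "33", "44", "55");
String s = "22,33,44,55,11";
String[] sArr = s.split(",");
StringBuilder ids = new StringBuilder();
for (String id : sArr) {
    if (whitelist.contains(id)) {
        // Plain "," separator so the result matches the expected "22,33,44,55".
        ids.append(id).append(",");
    }
}
// Drop the trailing separator; guard against the case where nothing was kept.
String r = ids.length() > 0 ? ids.substring(0, ids.length() - 1) : "";
System.out.println(r);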

If you need a solution with regex, then the following works for the inputs above.

    int elimiate_user_id = 11;

    String css1 = "11,22,33,44,55";
    String css2 = "22,33,11,44,55";   
    String css3 = "22,33,44,55,11";

    String resultCss=css1.replaceAll(elimiate_user_id+"[,]*", "").replaceAll(",$", "");

It works with all three inputs shown, though it would also strip 11 out of IDs such as 111 or 211.

This should work:

replaceAll("(11,|,11)", "")

At least as long as you can guarantee that there is no 311, or ,113 or similar.
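
To make that caveat concrete, here is a small sketch of what happens when such ids are present. The class and method names are hypothetical.

```java
class NaiveReplacePitfall {
    // The unanchored pattern from above, with no word boundaries.
    static String naiveRemove(String csv) {
        return csv.replaceAll("(11,|,11)", "");
    }

    public static void main(String[] args) {
        System.out.println(naiveRemove("22,33,11,44,55")); // 22,33,44,55
        System.out.println(naiveRemove("311,22"));         // 322
        System.out.println(naiveRemove("22,113"));         // 223
    }
}
```

In the last two cases the match eats part of a neighbouring id along with the comma, which silently merges two ids into one.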
