String.replace() not replacing all occurrences

I have a very long string which looks similar to this:

355,356,357,358,359,360,361,382,363,364,365,366,360,361,363,366,360,361,363,366,360,361,363,366,360,361,363,366,360,361,363,366,360,361,363,366,360,361,363,366,368,369,313,370,371,372,373,374,375,376,377,378,379,380,381,382,382,382,382,382,382,383,384,385,380,381,382,382,382,382,382,386,387,388,389,380,381,382,382,382,382,382,382,390,391,380,381,382,382,382,382,382,392,393,394,395,396,397,398,399,....

I tried using the following code to remove the number 382 from the string:

String str = "355,356,357,358,359,360,361,382,363,364,365,366,360,361,363,366,360,361,363,366,360,361,363,366,360,361,363,366,360,361,363,366,360,361,363,366,360,361,363,366,368,369,313,370,371,372,373,374,375,376,377,378,379,380,381,382,382,382,382,382,382,383,384,385,380,381,382,382,382,382,382,386,387,388,389,380,381,382,382,382,382,382,382,390,391,380,381,382,382,382,382,382,392,393,394,395,396,397,398,399,....";
str = str.replace(",382,", ",");

But it seems that not all occurrences are being replaced. The string, which originally had more than 3000 occurrences, was still left with about 630 occurrences after replacing.

Is the capability of String.replace() limited? If so, is there a possible way of achieving what I need?

I think the issue is the first argument to replace(), in particular the commas before and after 382. If you have "382,382,383", you will only match the inner ",382," and leave the initial one behind. Try:

str = str.replace("382,", "");

Although this will fail to match "382" at the very end, as it does not have a comma after it.

A full solution might entail two method calls, thus:

str = str.replace("382", "");       // Remove all instances of 382
str = str.replaceAll(",,+", ",");   // Compress all duplicates, triplicates, etc. of commas

This combines the two approaches:

str = str.replaceAll("382,?", "");  // Remove 382 and an optional comma after it.

Note: both of the last two approaches leave a trailing comma if 382 is at the end.

You need to replace the trailing comma as well (if one exists, which it won't if 382 is last in the list):

str = str.replaceAll("\\b382,?", "");

Note the \\b word boundary, which prevents matching inside "-,1382,-".

The above will convert:

382,111,382,1382,222,382

to:

111,1382,222,

(The final 382 has no comma after it, so the comma before it is left behind as a trailing comma.)
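If the leftover commas are a concern, one possible variation is to match 382 only as a whole list item and then tidy the commas in separate passes. The following is a minimal sketch, not part of the answer above: the class name WholeTokenRegex and the method remove382 are made up for illustration, and it assumes the list never contains intentionally empty items, since runs of commas are collapsed.

```java
final class WholeTokenRegex {
    // Hypothetical helper: removes every list item that is exactly "382".
    static String remove382(String s) {
        return s
            // 382 not preceded and not followed by a non-comma character,
            // i.e. bounded by a comma or by the start/end of the string.
            .replaceAll("(?<![^,])382(?![^,])", "")
            // Collapse the runs of commas left behind by the removal.
            .replaceAll(",{2,}", ",")
            // Drop a single leading or trailing comma, if any.
            .replaceAll("^,|,$", "");
    }

    public static void main(String[] args) {
        System.out.println(remove382("382,111,382,1382,222,382")); // 111,1382,222
        System.out.println(remove382("3820,382,13820"));           // 3820,13820
    }
}
```

Because the lookarounds do not consume the commas, adjacent items such as "382,382" are both matched in a single pass.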

Try this:

str = str.replaceAll(",382,", ",");

First, drop the preceding comma from your match string. Then remove the duplicated commas by replacing each run of commas with a single comma, using a Java regular expression.

String input = "355,356,357,358,359,360,361,382,363,364,365,366,360,361,363,366,360,361,363,366,360,361,363,366,360,361,363,366,360,361,363,366,360,361,363,366,360,361,363,366,368,369,313,370,371,372,373,374,375,376,377,378,379,380,381,382,382,382,382,382,382,383,384,385,380,381,382,382,382,382,382,386,387,388,389,380,381,382,382,382,382,382,382,390,391,380,381,382,382,382,382,382,392,393,394,395,396,397,398,399";
String result = input.replace("382,", ",");       // match without the preceding comma
String result2 = result.replaceAll("[,]+", ",");  // collapse each run of commas into one

System.out.println(result2);

As dave already said, the problem is that the occurrences of your pattern overlap. In the string "...,382,382,..." there are two occurrences of ",382,":

"...,382,382,..."
    -----         first occurrence
        -----     second occurrence

These two occurrences overlap at the comma, so Java can only replace one of them. While searching for occurrences, it does not yet see what you replace the pattern with, and so it does not see that a new occurrence of ",382," is created when the first occurrence is replaced by the comma.
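The overlap is easy to reproduce on a short string. The following is a small sketch for illustration only: the class name OverlapDemo and the method singlePass are made up, and the input is a shortened stand-in for the string in the question.

```java
final class OverlapDemo {
    // One pass, exactly as in the question's code.
    static String singlePass(String s) {
        return s.replace(",382,", ",");
    }

    public static void main(String[] args) {
        String s = "1,382,382,382,2";
        // Matches are found left to right and never overlap: the comma that
        // ends one match cannot also start the next one, so the middle 382
        // survives the first pass.
        System.out.println(singlePass(s));             // 1,382,2
        // A second pass picks up the occurrence created by the first one.
        System.out.println(singlePass(singlePass(s))); // 1,2
    }
}
```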

If your data is known not to contain numbers with more than 3 digits, then you might do:

str = str.replace("382,", "");

and then handle occurrences at the end as a special case. But if your data can contain bigger numbers, then "...,1382,400,..." will be turned into "...,1400,...", which is probably not what you want.
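To make the risk concrete, here is a short sketch of what the comma-suffixed replacement does to a longer number. The class name SubstringPitfall, the method naive, and the sample input are all made up for illustration.

```java
final class SubstringPitfall {
    // The replacement from above, with no check on what precedes "382,".
    static String naive(String s) {
        return s.replace("382,", "");
    }

    public static void main(String[] args) {
        // "382," is also the tail of "1382,", so the leftover "1" is fused
        // onto the next number.
        System.out.println(naive("5,1382,400")); // 5,1400
        // With numbers of at most 3 digits the same call behaves as intended.
        System.out.println(naive("5,382,400"));  // 5,400
    }
}
```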

Here are two solutions that do not have the above problem:

First, simply repeat the replacement until no more changes occur:

String oldString = str;
str = str.replace(",382,", ",");
while (!str.equals(oldString)) {
    oldString = str;
    str = str.replace(",382,", ",");
}

After that, you will have to handle possible occurrences at the start and at the end of the string.
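A possible way to fold the edge handling into the same routine is sketched below. It is only one reading of "handle the ends": the class name RepeatUntilStable and the method remove382 are made up, and the treatment of a string that consists solely of "382" (returning an empty string) is an assumption.

```java
final class RepeatUntilStable {
    // Hypothetical helper: repeats the replacement until nothing changes,
    // then removes occurrences at the start and end of the string.
    static String remove382(String str) {
        String previous;
        do {
            previous = str;
            str = str.replace(",382,", ",");
        } while (!str.equals(previous));

        // Occurrences at the start have no comma before them.
        while (str.startsWith("382,")) {
            str = str.substring("382,".length());
        }
        // Occurrences at the end have no comma after them.
        while (str.endsWith(",382")) {
            str = str.substring(0, str.length() - ",382".length());
        }
        // Assumption: a list holding only 382 becomes the empty string.
        return str.equals("382") ? "" : str;
    }

    public static void main(String[] args) {
        System.out.println(remove382("382,382,111,382,382,1382,222,382,382")); // 111,1382,222
    }
}
```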

Second, if you have Java 8, you can do a little more work yourself and use Java streams:

str = Arrays.stream(str.split(","))
    .filter(s -> !s.equals("382"))
    .collect(Collectors.joining(","));

This first splits the string at ",", then filters out all strings which are equal to "382", and then joins the remaining strings again with "," in between.
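For completeness, the stream approach can be wrapped into a self-contained class with the imports it needs. This is a sketch under a few assumptions: the class name StreamFilter, the method removeToken, and the token parameter are made up for illustration.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

final class StreamFilter {
    // Hypothetical helper: drops every item of a comma-separated list
    // that is exactly equal to the given token.
    static String removeToken(String csv, String token) {
        return Arrays.stream(csv.split(","))
                .filter(item -> !item.equals(token))
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        System.out.println(removeToken("382,111,382,1382,222,382", "382")); // 111,1382,222
    }
}
```

Since each item is compared as a whole, "1382" is untouched and occurrences at the start or end need no special case. Note that String.split(",") discards trailing empty strings, so trailing commas in the input are not preserved.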

(Both code snippets are untested.)

Traditional way:

    String str = ",abc,null,null,0,0,7,8,9,10,11,12,13,14";
    String newStr = "", word = "";
    for (int i=0; i<str.length(); i++) {
        if (str.charAt(i) == ',') {
            if (word.equals("null") || word.equals("0"))
                word = "";
            newStr += word+",";
            word = "";
        } else {
            word += str.charAt(i);
            if (i == str.length()-1)
                newStr += word;
        }
    }
    System.out.println(newStr);

Output: ,abc,,,,,7,8,9,10,11,12,13,14
