Replace substring only in certain portions of text delimited by some character

I need to replace all occurrences of a substring, but only where it is preceded by "]" and followed by "[" (preceded and followed, though not necessarily directly adjacent to the substring). Example:

This is the string in which I need to make the substitutions:

[style and tags info] valid text info [more style info] more info here[styles]

If the replacement to make is info -> change (the search expression may be more than a single word)

The result should be:

[style and tags info] valid text change [more style info] more change here[styles]

My idea was to use a regex to isolate the words I have to change and then make the replacement with a call to replaceAll.

But I have tried several regexes to isolate the search expression without success, mainly because I would need something like

(?<=.*)

that is, a lookbehind with an arbitrary number of characters before the word I am looking for. This is not supported by Java regex (nor by any other regex implementation that I know of).

I have found this solution, written in MATLAB, but it seems harder to replicate in Java:

Matlab regex - replace substring ONLY within angled brackets

Is there a simpler approach? Some regex I have not considered?

I'd say the easiest way here is to split the string into (parts outside the brackets) and (parts inside the brackets), and then only apply the replacements to (parts outside the brackets), since that is the text that follows a "]" and precedes a "[".

For example, you can do this using split (this assumes that your [] are evenly balanced, that you never open two [[ in a row, etc.):

// Split on either bracket. The limit of -1 keeps trailing empty strings,
// so an empty "[]" at the end of the input is not dropped.
String[] parts = str.split("[\\[\\]]", -1);
StringBuilder sb = new StringBuilder(str.length());
for (int i = 0; i < parts.length; i++) {
  if (i % 2 == 0) {
    // This bit was outside [], so apply the replacement.
    sb.append(parts[i].replace("info", "change"));
  } else {
    // This bit was inside [], so keep it unchanged
    // (and re-append the delimiters).
    sb.append("[");
    sb.append(parts[i]);
    sb.append("]");
  }
}
String newStr = sb.toString();
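
For reference, here is the split idea as a small self-contained sketch, run on the string from the question. It replaces only in the parts outside the brackets, which is what the question asks for. The class name SplitReplaceSketch and the helper replaceOutsideBrackets are made up for illustration, and balanced, non-nested brackets are assumed.

```java
final class SplitReplaceSketch {

    // Hypothetical helper: replaces `target` with `replacement` only in the
    // text outside [...]. Assumes balanced, non-nested brackets.
    static String replaceOutsideBrackets(String str, String target, String replacement) {
        // Splitting on either bracket makes the parts alternate:
        // even indices lie outside [], odd indices lie inside.
        // The limit of -1 keeps trailing empty strings.
        String[] parts = str.split("[\\[\\]]", -1);
        StringBuilder sb = new StringBuilder(str.length());
        for (int i = 0; i < parts.length; i++) {
            if (i % 2 == 0) {
                // Outside the brackets: plain substring replacement.
                sb.append(parts[i].replace(target, replacement));
            } else {
                // Inside the brackets: copy through and restore the delimiters.
                sb.append('[').append(parts[i]).append(']');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = "[style and tags info] valid text info [more style info] more info here[styles]";
        System.out.println(replaceOutsideBrackets(s, "info", "change"));
        // prints: [style and tags info] valid text change [more style info] more change here[styles]
    }
}
```

Note that String.replace works on plain substrings, so "information" outside the brackets would also be touched; a word-boundary regex is needed if only whole words should change.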

It seems more appropriate to match and skip the substrings that start with [, then have one or more characters other than [ and ] up to the closing ], and to replace info with change in all other contexts. For this purpose, you may use the Matcher#appendReplacement() method:

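import java.util.regex.Matcher;
import java.util.regex.Pattern;

String s = "[style and tags info] valid text info [more style info] more info here[styles]";
StringBuffer result = new StringBuffer();
Matcher m = Pattern.compile("\\[[^\\]\\[]+]|\\b(info)\\b").matcher(s);
while (m.find()) {
    if (m.group(1) != null) {
        // Group 1 matched: this is the word "info" outside any [...].
        m.appendReplacement(result, "change");
    }
    else {
        // A [...] block matched: put it back unchanged. quoteReplacement
        // keeps any '$' or '\' in the block from being read as a group reference.
        m.appendReplacement(result, Matcher.quoteReplacement(m.group()));
    }
}
m.appendTail(result);
System.out.println(result.toString());
// => [style and tags info] valid text change [more style info] more change here[styles]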


The \\[[^\\]\\[]+]|\\b(info)\\b regex has two alternative branches: \\[[^\\]\\[]+] matches the [...] substrings, and \\b(info)\\b (Group 1) captures the whole word info. If Group 1 matched, the replacement is made; otherwise, the matched [...] substring is inserted back into the result unchanged.
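
Since the question mentions that the search expression may be more than a single word, the same match-and-skip idea can be wrapped in a helper that takes the expression as a parameter. This is only a sketch: the class SkipBracketsSketch and the method replaceOutsideBrackets are hypothetical names, and the \b anchors assume the expression starts and ends with a word character.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SkipBracketsSketch {

    // Hypothetical helper: replaces whole-word occurrences of `target`
    // outside [...] and copies every [...] block through unchanged.
    static String replaceOutsideBrackets(String text, String target, String replacement) {
        // Pattern.quote keeps regex metacharacters in `target` literal.
        Pattern p = Pattern.compile(
                "\\[[^\\]\\[]+]|\\b(" + Pattern.quote(target) + ")\\b");
        Matcher m = p.matcher(text);
        StringBuffer result = new StringBuffer();
        while (m.find()) {
            // Group 1 is set only when the search expression itself matched.
            String piece = (m.group(1) != null) ? replacement : m.group();
            // quoteReplacement stops '$' and '\' from being read as group references.
            m.appendReplacement(result, Matcher.quoteReplacement(piece));
        }
        m.appendTail(result);
        return result.toString();
    }

    public static void main(String[] args) {
        String s = "[tags: text info] some text info here [text info] text information";
        System.out.println(replaceOutsideBrackets(s, "text info", "change"));
        // prints: [tags: text info] some change here [text info] text information
    }
}
```

Because the bracket branch comes first in the alternation and a match always starts at the leftmost possible position, anything inside [...] is consumed as a block before the second branch gets a chance to see it.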

As for your original logic, yes, you can use a "simple" .replaceAll with the (?:\\G|(?<=]))([^\\]\\[]*?)\\binfo\\b regex (with $1change as the replacement), but I doubt it is what you need.
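
To make the replaceAll variant concrete, here is a minimal sketch run on the string from the question. The class and method names are made up for illustration. Each match starts either right after a "]" or where the previous match ended (\G), then lazily consumes non-bracket characters into Group 1 up to the next whole word info.

```java
final class ReplaceAllSketch {

    // Hypothetical helper wrapping the one-line replaceAll call.
    static String replaceInfo(String s) {
        // (?:\G|(?<=]))  start at the end of the previous match, or right after a "]"
        // ([^\]\[]*?)    lazily take non-bracket characters (Group 1, kept via $1)
        // \binfo\b       the whole word to replace
        String regex = "(?:\\G|(?<=]))([^\\]\\[]*?)\\binfo\\b";
        return s.replaceAll(regex, "$1change");
    }

    public static void main(String[] args) {
        String s = "[style and tags info] valid text info [more style info] more info here[styles]";
        System.out.println(replaceInfo(s));
        // prints: [style and tags info] valid text change [more style info] more change here[styles]
    }
}
```

One thing to keep in mind: \G also matches at the very start of the input, so text before the first "[" is treated as replaceable even though no "]" precedes it.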
