How do I create a regular expression to replace a known string while keeping optional parameters intact?

I've been trying for a while now but can't get it right: in Java, I am trying to create a regular expression that matches and replaces a string (known to me) inside a larger string while keeping optional parameters intact.

Example inputs:

{067e6162-3b6f-4ae2-a171-2470b63dff00}
{067e6162-3b6f-4ae2-a171-2470b63dff00,number}
{067e6162-3b6f-4ae2-a171-2470b63dff00,number,integer}
{067e6162-3b6f-4ae2-a171-2470b63dff00,choice,1#one more item|1<another {067e6162-3b6f-4ae2-a171-2470b63dff00,number,integer} items}

(Note that the last example contains a nested reference to the same input string.) The format always encloses the to-be-replaced string in curly brackets {...}, optionally followed by a list of comma-separated parameters.

I want to replace the input string with a number, e.g. for the above input strings the result should be:

{2}
{2,number}
{2,number,integer}
{2,choice,1#one more item|1<another {2,number,integer} items}

Ideally, I'd like a regex that is flexible enough to handle (almost) any string as the pattern to be replaced, so not just UUID-like strings as above but also something like this:

A test string with {the_known_input_value_to_be_replaced,number,integer} not replacing the_known_input_value_to_be_replaced if its not in curly brackets of course.

which should end up as, e.g.:

A test string with {3,number,integer} not replacing the_known_input_value_to_be_replaced if its not in curly brackets of course.

Note that the substitution should only take place if the input string is in curly brackets. In Java I will be able to construct the pattern at runtime, inserting the to-be-replaced string verbatim.

I tried e.g. \\{(067e6162-3b6f-4ae2-a171-2470b63dff00)(,?.*)\\} (not Java-escaped yet) and more generic approaches like \\{(+?)(,?.*)\\} , but none of them works correctly.

Any advice from regex ninjas is highly appreciated :)

If the known old string always occurs right after {, you can just use

String result = old_text.replace("{" + my_old_keyword, "{" + my_new_keyword);
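The plain replace above would also rewrite a longer key that merely starts with the known string (for example {abcd} when the key is abc). As a hedged alternative, here is a minimal regex sketch that only matches the key when it is followed by a comma or a closing brace. The class name KeyReplacer and the method replaceKey are hypothetical names chosen for illustration; Pattern.quote and Matcher.quoteReplacement are used so that arbitrary key and replacement text is treated literally.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class KeyReplacer {
    // Hypothetical helper: replaces "{oldKey" with "{newKey" only when the key
    // is immediately followed by ',' or '}' (checked with a lookahead, so the
    // following character is not consumed).
    static String replaceKey(String text, String oldKey, String newKey) {
        Pattern p = Pattern.compile("\\{" + Pattern.quote(oldKey) + "(?=[,}])");
        return p.matcher(text).replaceAll(Matcher.quoteReplacement("{" + newKey));
    }

    public static void main(String[] args) {
        String key = "067e6162-3b6f-4ae2-a171-2470b63dff00";
        String input = "{" + key + ",choice,1#one more item|1<another {" + key + ",number,integer} items}";
        // prints {2,choice,1#one more item|1<another {2,number,integer} items}
        System.out.println(replaceKey(input, key, "2"));
    }
}
```

Because the match is anchored on the opening brace, occurrences of the key outside curly brackets are left alone, and nested references are handled without tracking brace depth.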

If you really have multiple known strings inside curly brackets (and there are no escaped curly brackets to take care of), you can use the following code:

String input = "067e6162-3b6f-4ae2-a171-2470b63dff00 is outside {067e6162-3b6f-4ae2-a171-2470b63dff00,choice,067e6162-3b6f-4ae2-a171-2470b63dff00,1#one more item|1<another {067e6162-3b6f-4ae2-a171-2470b63dff00,number,067e6162-3b6f-4ae2-a171-2470b63dff00,integer} items} 067e6162-3b6f-4ae2-a171-2470b63dff00 is outside ";
String old_key = "067e6162-3b6f-4ae2-a171-2470b63dff00";
String new_key = "NEW_KEY";
List<String> chunks = replaceInBalancedSubstrings(input, '{', '}', old_key, new_key);
System.out.println(String.join("", chunks));

Result: 067e6162-3b6f-4ae2-a171-2470b63dff00 is outside {NEW_KEY,choice,NEW_KEY,1#one more item|1<another {NEW_KEY,number,NEW_KEY,integer} items} 067e6162-3b6f-4ae2-a171-2470b63dff00 is outside

The replaceInBalancedSubstrings method will look like:

// Requires: import java.util.ArrayList; import java.util.List;
public static List<String> replaceInBalancedSubstrings(String s, char markStart, char markEnd, String old_key, String new_key) {
    List<String> subTreeList = new ArrayList<String>();
    int level = 0;                          // current nesting depth
    StringBuilder sb = new StringBuilder(); // collects text outside any brackets
    int lastOpenBracket = -1;               // index of the current top-level opening bracket
    for (int i = 0; i < s.length(); i++) {
        char c = s.charAt(i);
        if (c == markStart) {
            level++;
            if (level == 1) {
                lastOpenBracket = i;
                // Flush the outside text; the bracket itself belongs to the next chunk,
                // so it must not be appended here (otherwise it is emitted twice)
                if (sb.length() > 0) {
                    subTreeList.add(sb.toString());
                    sb.setLength(0);
                }
            }
        }
        else if (c == markEnd && level > 0) {
            if (level == 1) {
                subTreeList.add(s.substring(lastOpenBracket, i+1).replace(old_key, new_key)); // String replacement here
            }
            level--;
        }
        else if (level == 0) {
            sb.append(c); // text outside brackets is copied unchanged
        }
    }
    if (level > 0) {
        // Unclosed bracket: keep the remaining text as it is
        subTreeList.add(s.substring(lastOpenBracket));
    }
    if (sb.length() > 0) {
        subTreeList.add(sb.toString());
    }
    return subTreeList;
}

See IDEONE demo

This code performs replacements only within substrings enclosed in balanced (nested) curly braces.
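The question's examples map different keys to different numbers (2 for the UUID, 3 for the_known_input_value_to_be_replaced). As a sketch under that assumption, several keys can be handled in one pass by building an alternation of quoted keys at runtime and looking each match up in a map. MultiKeyReplacer and replaceKeys are hypothetical names, and this variant only replaces a key that directly follows an opening brace, unlike the brace-tracking method above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class MultiKeyReplacer {
    // Hypothetical helper: replaces every "{key" (followed by ',' or '}')
    // with "{value" for each key/value pair in the map.
    static String replaceKeys(String text, Map<String, String> replacements) {
        if (replacements.isEmpty()) {
            return text; // an empty alternation would match a bare "{"
        }
        // Quote each key so regex metacharacters in it are treated literally
        String alternation = replacements.keySet().stream()
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        Matcher m = Pattern.compile("\\{(" + alternation + ")(?=[,}])").matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement = "{" + replacements.get(m.group(1));
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> map = new LinkedHashMap<>();
        map.put("067e6162-3b6f-4ae2-a171-2470b63dff00", "2");
        map.put("the_known_input_value_to_be_replaced", "3");
        String input = "{067e6162-3b6f-4ae2-a171-2470b63dff00,number} and "
                + "{the_known_input_value_to_be_replaced,number,integer}";
        // prints {2,number} and {3,number,integer}
        System.out.println(replaceKeys(input, map));
    }
}
```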
