Regex using Java String.replaceAll

I am looking to replace values in a Java string as follows. The code below is not working.

        cleanInst.replaceAll("[<i>]", "");
        cleanInst.replaceAll("[</i>]", "");
        cleanInst.replaceAll("[//]", "/");
        cleanInst.replaceAll("[\bPhysics Dept.\b]", "Physics Department");
        cleanInst.replaceAll("[\b/n\b]", ";");
        cleanInst.replaceAll("[\bDEPT\b]", "The Department");
        cleanInst.replaceAll("[\bDEPT.\b]", "The Department");
        cleanInst.replaceAll("[\bThe Dept.\b]", "The Department");
        cleanInst.replaceAll("[\bthe dept.\b]", "The Department");
        cleanInst.replaceAll("[\bThe Dept\b]", "The Department");
        cleanInst.replaceAll("[\bthe dept\b]", "The Department");
        cleanInst.replaceAll("[\bDept.\b]", "The Department");
        cleanInst.replaceAll("[\bdept.\b]", "The Department");
        cleanInst.replaceAll("[\bdept\b]", "The Department");

What is the easiest way to achieve the above replacements?

If this is a function you call repeatedly, there is a problem: String.replaceAll compiles its regular expression again on every call. It is best to compile the patterns once and keep them as constants. You could have something like this:

import java.util.regex.Pattern;

private static final Pattern[] patterns = {
    Pattern.compile("</?i>"),
    Pattern.compile("//"),
    // Others
};

private static final String[] replacements = {
    "",
    "/",
    // Others
};

public static String cleanString(String str) {
    for (int i = 0; i < patterns.length; i++) {
        str = patterns[i].matcher(str).replaceAll(replacements[i]);
    }
    return str;
}

cleanInst.replaceAll("[<i>]", "");

should be:

cleanInst = cleanInst.replaceAll("[<i>]", "");

since the String class is immutable and doesn't change its internal state, i.e. replaceAll() returns a new instance that's different from cleanInst.
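A minimal sketch of the difference, assuming a sample input string and borrowing the `</?i>` pattern from another answer. The class and method names are made up for illustration.

```java
final class ImmutableStringDemo {

    // Calls replaceAll but throws the result away, as in the question.
    static String withoutAssignment(String cleanInst) {
        cleanInst.replaceAll("</?i>", ""); // returned String is discarded
        return cleanInst;                  // still the original text
    }

    // Assigns the returned String back to the variable.
    static String withAssignment(String cleanInst) {
        cleanInst = cleanInst.replaceAll("</?i>", "");
        return cleanInst;
    }

    public static void main(String[] args) {
        String sample = "<i>Physics</i> Dept."; // hypothetical input
        System.out.println(withoutAssignment(sample)); // <i>Physics</i> Dept.
        System.out.println(withAssignment(sample));    // Physics Dept.
    }
}
```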

You should read a basic regular expressions tutorial.

Until then, what you tried to do can be done like this:

cleanInst = cleanInst.replace("//", "/");
cleanInst = cleanInst.replaceAll("</?i>", "");
cleanInst = cleanInst.replaceAll("/n\\b", ";");
cleanInst = cleanInst.replaceAll("\\bPhysics Dept\\.", "Physics Department");
cleanInst = cleanInst.replaceAll("(?i)\\b(?:the )?dept\\b\\.?", "The Department");

You could probably chain all those replace operations (but I don't know the proper Java syntax for this).
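For what it's worth, chaining works in Java because each call returns a new String, so the next call can be applied to it directly. A rough sketch, with the same patterns in the same order; the class name, method name and sample input are assumptions, not part of the original code:

```java
final class ChainedCleanup {

    // Applies the replacements in order; each call operates on the
    // String returned by the previous one.
    static String clean(String cleanInst) {
        return cleanInst
                .replace("//", "/")          // literal replacement, no regex
                .replaceAll("</?i>", "")     // strip <i> and </i>
                .replaceAll("/n\\b", ";")
                .replaceAll("\\bPhysics Dept\\.", "Physics Department")
                .replaceAll("(?i)\\b(?:the )?dept\\b\\.?", "The Department");
    }

    public static void main(String[] args) {
        // Hypothetical input for illustration.
        System.out.println(clean("<i>the dept.</i> of Chemistry//Biology"));
        // prints: The Department of Chemistry/Biology
    }
}
```

The order matters: "Physics Dept." is handled before the generic "dept" rule, and the generic rule leaves the resulting "Department" alone because there is no word boundary after its first four letters.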

About the word boundaries: \\b usually only makes sense directly before or after an alphanumeric character.

For example, \\b/n\\b will only match /n if it's directly preceded by an alphanumeric character and followed by a non-alphanumeric character, so it matches "a/n!" but not "foo /n bar".
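The two cases can be checked with a small sketch. The class and method names are hypothetical; the second pattern is the one without the leading \\b, as used in the replacement code.

```java
import java.util.regex.Pattern;

final class WordBoundaryDemo {

    // Boundary required on both sides of "/n".
    private static final Pattern BOTH_SIDES = Pattern.compile("\\b/n\\b");
    // Boundary required only after the "n".
    private static final Pattern TRAILING_ONLY = Pattern.compile("/n\\b");

    static boolean bothSidesFinds(String s) {
        return BOTH_SIDES.matcher(s).find();
    }

    static boolean trailingOnlyFinds(String s) {
        return TRAILING_ONLY.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(bothSidesFinds("a/n!"));           // true
        System.out.println(bothSidesFinds("foo /n bar"));     // false: no boundary between ' ' and '/'
        System.out.println(trailingOnlyFinds("foo /n bar"));  // true
    }
}
```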
