
Java Regex Remove Text Between and Including Parenthesis from String

I am programming in Java, and I have a few Strings that look similar to this:

"Avg. Price ($/lb)" 
"Average Price ($/kg)"

I want to remove the ($/lb) and ($/kg) from both Strings and be left with

"Avg. Price" 
"Average Price".

My code checks whether a String str variable matches one of the strings above, and if it does, replaces the text inside including the parentheses with an empty string:

    if (str.matches(".*\\(.+?\\)")) {
        str = str.replaceFirst("\\(.+?\\)", "");
    }

When I change str.matches to str.contains("$/lb") as a test, the wanted substring is removed, which leads me to believe there is something wrong with the if statement. Any help as to what I am doing wrong? Thank you.

Update: I changed the if statement to:

    if (str.contains("(") && str.contains(")"))

Maybe not an elegant solution but it seems to work.

str.matches has always been problematic for me. It behaves as if the regex you pass were wrapped in '^' and '$', so the pattern has to match the entire string.

Since you just care about replacing any occurrence of the string in question - try the following:

str = str.replaceAll("\\s+\\(\\$\\/(lb|kg)\\)", "");
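As a quick check of the expression above, here is a minimal sketch that applies it to the two sample strings from the question. The class name UnitSuffixDemo and the helper stripUnit are made up for illustration. Note that replaceAll simply returns the text unchanged when nothing matches, so no guarding if statement is needed.

```java
public class UnitSuffixDemo {
    // Hypothetical helper: removes " ($/lb)" or " ($/kg)", including the
    // whitespace in front of it, wherever it occurs in the string.
    static String stripUnit(String s) {
        return s.replaceAll("\\s+\\(\\$\\/(lb|kg)\\)", "");
    }

    public static void main(String[] args) {
        System.out.println(stripUnit("Avg. Price ($/lb)"));    // Avg. Price
        System.out.println(stripUnit("Average Price ($/kg)")); // Average Price
        // No match: the input comes back unchanged.
        System.out.println(stripUnit("No units here"));        // No units here
    }
}
```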

There is an online regex testing tool that you can also try out to see how your expression works out.

EDIT: With regard to your comment, the expression could be altered to just:

str = str.replaceAll("\\s+\\([^)]+\\)$", "");

This means: find one or more whitespace characters, followed by a literal '(', then one or more characters that are not ')', followed by a literal ')' at the end of the string.
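To make the effect of the trailing '$' concrete, here is a small sketch. The class name TrailingParenDemo and the helper stripTrailingGroup are hypothetical, and the second and third inputs are invented test strings, not data from the question.

```java
public class TrailingParenDemo {
    // Hypothetical helper: removes a parenthesised group, plus the whitespace
    // before it, only when the group sits at the very end of the string.
    static String stripTrailingGroup(String s) {
        return s.replaceAll("\\s+\\([^)]+\\)$", "");
    }

    public static void main(String[] args) {
        // Group at the end: removed.
        System.out.println(stripTrailingGroup("Avg. Price ($/lb)"));    // Avg. Price
        // Group in the middle: left alone because of the '$' anchor.
        System.out.println(stripTrailingGroup("Price (old) per unit")); // Price (old) per unit
        // A trailing space after ')' also prevents the match.
        System.out.println("[" + stripTrailingGroup("Avg. Price ($/lb) ") + "]"); // [Avg. Price ($/lb) ]
    }
}
```

If the input may carry trailing whitespace, calling trim() before the replacement is one way to handle it.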

Is that more in-line with your expectation?

Additionally, heed the comment about 'matches()' vs 'find()'; that difference is very likely what is affecting your code.

Unlike in most other popular application languages, the matches() method in Java only returns true if the regex matches the whole string (not just part of the string, as in Perl, Ruby, PHP, JavaScript, etc.).
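The difference can be seen in a short sketch. The class name MatchesVsFindDemo and its two helper methods are made up for illustration; String.matches, Pattern.compile, and Matcher.find are the standard library calls.

```java
import java.util.regex.Pattern;

public class MatchesVsFindDemo {
    private static final Pattern PAREN = Pattern.compile("\\(.+?\\)");

    // True only if the ENTIRE string is a parenthesised group.
    static boolean wholeStringMatches(String s) {
        return s.matches("\\(.+?\\)");
    }

    // True if a parenthesised group occurs ANYWHERE in the string.
    static boolean containsMatch(String s) {
        return PAREN.matcher(s).find();
    }

    public static void main(String[] args) {
        String s = "Avg. Price ($/lb)";
        System.out.println(wholeStringMatches(s));        // false
        System.out.println(containsMatch(s));             // true
        System.out.println(wholeStringMatches("($/lb)")); // true
    }
}
```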

The regex to match bracketed input, including any leading spaces, is:

" *\\(.+?\\)"

and the code to use this to remove matches is:

str = str.replaceAll(" *\\(.+?\\)", "");

Here's some test code:

String str = "foo (stuff) bar(whatever)";
str = str.replaceAll(" *\\(.+?\\)", "");
System.out.println(str);

Output:

"foo bar"

This code works fine.

    String str = "Avg. Price ($/lb) Average Price ($/kg)";

    if (str.matches(".*\\(.+?\\)")) {
        str = str.replaceAll("\\(.+?\\)", "");
    }
    System.out.println("str: " + str);

This prints "str: Avg. Price  Average Price " (the space that preceded each parenthesised group is left behind), which is what you need.

Note: I changed replaceFirst to replaceAll here.

String first = "^(\\w+\\.\\s\\w+)";

This would match Avg. Price

String second = "^(\\w+\\s\\w+)";

This would match Average Price

Hope this simple answer helps.

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please indicate the site URL or the original address.

 