How do I check in Java whether a word is preceded by a space, an underscore, or the beginning of the line/string?

I have to check a string for a specific word. The condition is that the word must be preceded and followed by either a space or an underscore, or sit at the start or end of the string. Matching should be case-insensitive.

Following is my code:

package example;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Test {
    public static void main(String[] args) {
        // TODO Auto-generated method stub
        String test = "The starting of  week";
        String t = "[^_ ]the[ _$]";
        Pattern p = Pattern.compile(t,Pattern.CASE_INSENSITIVE);
        Matcher matcher = p.matcher(test);
        if( matcher.find() ){
            System.out.println(true);  

        }
    }
}

What am I doing wrong?

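The problem is that a '^' at the start of a character class negates the class: [^abc] matches any character except 'a', 'b', or 'c'. So [^_ ] means "any character other than an underscore or a space", not "the beginning of the input". You will need to change your pattern to something like: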

String t = "(^|[_ ])the[ _$]";

Note that escaping the '^' character inside the character class doesn't work either:

String t = "[\\^_ ]the[ _$]";

That would match the literal character '^', not the beginning of the input.
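A small sketch of what the escaped caret does in practice. The class name CaretInClassDemo and the helper found are made up for illustration; only Pattern and Matcher come from the JDK.

```java
import java.util.regex.Pattern;

class CaretInClassDemo {
    // Hypothetical helper: true if the regex matches anywhere in the input,
    // ignoring case.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex, Pattern.CASE_INSENSITIVE)
                      .matcher(input)
                      .find();
    }

    public static void main(String[] args) {
        // Inside [...] the escaped caret is just the character '^'.
        String escaped = "[\\^_ ]the[ _$]";

        // A literal '^' in front of the word satisfies the class.
        System.out.println(found(escaped, "^the week"));             // true

        // The word at the very start of the input has no character
        // before it, so the class has nothing to match.
        System.out.println(found(escaped, "The starting of  week")); // false
    }
}
```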

EDIT: By the way, the same problem exists with the '$' character: inside a character class it is a literal dollar sign, not the end of the input. So you will need:

String t = "(^|[_ ])the([ _]|$)";
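As a rough check, the corrected pattern can be dropped into a small test program like the one below. The class name WordCheckDemo, the method containsWord, and the sample inputs are assumptions for illustration, not part of the original code.

```java
import java.util.regex.Pattern;

class WordCheckDemo {
    // Anchors are alternatives outside the character classes:
    // (start of input OR '_' OR ' '), then "the", then ('_' OR ' ' OR end of input).
    private static final Pattern THE =
            Pattern.compile("(^|[_ ])the([ _]|$)", Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: true if the delimited word occurs anywhere in the input.
    static boolean containsWord(String input) {
        return THE.matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(containsWord("The starting of  week")); // true  (start of input, then space)
        System.out.println(containsWord("end_the"));               // true  (underscore, then end of input)
        System.out.println(containsWord("bathe in water"));        // false ("the" is preceded by 'a')
        System.out.println(containsWord("other things"));          // false ("the" is inside "other")
    }
}
```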

Use lookarounds, so that the match contains only the word itself:

(?mi)(?<=^|[_ ])the(?=[_ ]|$)

If you want to handle any kind of whitespace, replace the space in each character class with \s (written "\\s" in a Java string literal).

Note the (?mi) part: m enables multiline mode, so ^ and $ also match at the start and end of each line, and i makes matching case-insensitive.
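A minimal sketch of how the lookaround pattern behaves on multi-line input. The class name LookaroundDemo, the method matchStarts, and the sample strings are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaroundDemo {
    // (?mi) sets MULTILINE and CASE_INSENSITIVE inline; the lookbehind and
    // lookahead check the delimiters without making them part of the match.
    private static final Pattern THE =
            Pattern.compile("(?mi)(?<=^|[_ ])the(?=[_ ]|$)");

    // Hypothetical helper: start index of every match in the input.
    static List<Integer> matchStarts(String input) {
        List<Integer> starts = new ArrayList<>();
        Matcher m = THE.matcher(input);
        while (m.find()) {
            starts.add(m.start());
        }
        return starts;
    }

    public static void main(String[] args) {
        // Line 1 starts with "The", line 2 starts with "the_",
        // line 3 ends with " the"; "bathe" must not match.
        System.out.println(matchStarts("The cat\nthe_dog\nbathe the")); // [0, 8, 22]

        // The shared space is not consumed, so both words are found.
        System.out.println(matchStarts("the the")); // [0, 4]
    }
}
```

Because the delimiters are not consumed, two occurrences separated by a single space are both found. With (^|[_ ])the([ _]|$) the first match swallows that space and the second occurrence is missed, which only matters when you need every match rather than a yes/no answer.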
