
Extract text from a string when it matches a regex

I am looking for sentences of the form "... X is educated at Y ..." in the third field of each line of a text document. X is known and Y is unknown. On a successful match, how can I get the value of Y? Here is my code:

    Pattern p1 = Pattern.compile(".* educated at .*");
    int count = 0;

    while((line = br.readLine()) != null){
        String datavalue[] = line.split("\t");
        String text = datavalue[2];
        Matcher m = p1.matcher(text);
        if(m.matches()){
            count++;
            //System.out.println(text);
            //How do I get Y?

        }
    }

I'm new to reg-ex. Any help is appreciated.

Capture the found text as a group:

Pattern p1 = Pattern.compile(".* educated at (.*)"); // note the parentheses: they define capture group 1
int count = 0;

while((line = br.readLine()) != null){
    String datavalue[] = line.split("\t");
    String text = datavalue[2];
    Matcher m = p1.matcher(text);
    if(m.matches()){
        count++;
        System.out.println(m.group(1));

    }
}

m.group(1) returns the text matched by the first pair of parentheses. Please see https://docs.oracle.com/javase/tutorial/essential/regex/groups.html for more information.
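Because the group `(.*)` is greedy, it captures everything after "educated at" to the end of the field. If only the name itself is wanted, one possible variation is a lazy group with `find()`. The sketch below is self-contained; the class name `EducatedAtDemo`, the helper `extractSchool`, the sample line, and the choice of a period or end of text as the delimiter are all assumptions for illustration, not part of the original code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EducatedAtDemo {
    // Lazy group: stops at the first period or at the end of the text.
    // The delimiter is an assumption; names containing a period would be cut short.
    private static final Pattern EDUCATED_AT =
            Pattern.compile("educated at (.+?)(?:\\.|$)");

    // Hypothetical helper: returns Y from the third tab-separated field, or null.
    static String extractSchool(String line) {
        String[] fields = line.split("\t");
        if (fields.length < 3) {
            return null; // no third field on this line
        }
        Matcher m = EDUCATED_AT.matcher(fields[2]);
        // find() searches inside the text, so no leading ".*" is needed
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String line = "1\tQ42\tAlice is educated at Oxford. She later moved.";
        System.out.println(extractSchool(line));
    }
}
```

Using `find()` instead of `matches()` avoids having to wrap the pattern in `.*` on both sides, and the length check guards against lines with fewer than three fields.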

You can do it in one line:

while((line = br.readLine()) != null){
    // [^\t]* keeps "educated at" and the text before it within a single tab-separated field
    String y = line.replaceAll(".*?\t.*?\t[^\t]*educated at (\\w+).*|.*", "$1");
}

The variable y will be blank if there's no match.
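As a quick check of that behaviour, here is a minimal sketch that runs the `replaceAll` approach on two made-up lines, using a `[^\t]` character class for the field that contains the phrase. The class name `ReplaceAllDemo`, the helper `extractY`, and the sample data are assumptions for illustration.

```java
public class ReplaceAllDemo {
    // Hypothetical helper wrapping the one-line replaceAll approach.
    static String extractY(String line) {
        // First alternative captures one word after "educated at";
        // the trailing "|.*" swallows non-matching lines so the result is empty.
        return line.replaceAll(".*?\t.*?\t[^\t]*educated at (\\w+).*|.*", "$1");
    }

    public static void main(String[] args) {
        String hit = "1\tQ42\tAlice is educated at Oxford in 1990";
        String miss = "2\tQ7\tBob works at Acme";
        System.out.println("[" + extractY(hit) + "]");
        System.out.println("[" + extractY(miss) + "]");
    }
}
```

Note that `(\\w+)` captures a single word only, so a multi-word value such as "Oxford University" would come back as "Oxford".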
