I am looking for sentences of the form "... X is educated at Y ..." in the third field of each line of a text document. X is known and Y is unknown. On a successful match, how can I get the value of Y? Here is my code:
Pattern p1 = Pattern.compile(".* educated at .*");
int count = 0;
while((line = br.readLine()) != null){
String datavalue[] = line.split("\t");
String text = datavalue[2];
Matcher m = p1.matcher(text);
if(m.matches()){
count++;
//System.out.println(text);
//How do I get Y?
}
}
I'm new to regex. Any help is appreciated.
Capture the found text as a group:
Pattern p1 = Pattern.compile(".* educated at (.*)"); // note the parentheses: they define capturing group 1
int count = 0;
while((line = br.readLine()) != null){
String datavalue[] = line.split("\t");
String text = datavalue[2];
Matcher m = p1.matcher(text);
if(m.matches()){
count++;
System.out.println(m.group(1));
}
}
Please see https://docs.oracle.com/javase/tutorial/essential/regex/groups.html for more information
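Since X is known in the question, the pattern can also include it literally. The following is only a minimal sketch under some assumptions: the class name `EducatedAt`, the helper `institution`, and the sample line are made up for illustration, and the sentence is assumed to read exactly "X is educated at Y". It uses `Pattern.quote` so that any regex metacharacters in X are treated literally, and `find()` instead of `matches()` so no leading `.*` is needed.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EducatedAt {
    // Hypothetical helper: looks in the third tab-separated field for
    // "<person> is educated at <Y>" and returns Y if it is found.
    static Optional<String> institution(String line, String person) {
        String[] fields = line.split("\t");
        if (fields.length < 3) {
            return Optional.empty(); // the line has no third field
        }
        // Pattern.quote makes the known name X match literally.
        Pattern p = Pattern.compile(Pattern.quote(person) + " is educated at (.*)");
        Matcher m = p.matcher(fields[2]);
        // find() searches anywhere in the field; group(1) is the captured Y.
        return m.find() ? Optional.of(m.group(1).trim()) : Optional.empty();
    }

    public static void main(String[] args) {
        String line = "1\tbio\tAlice is educated at MIT"; // made-up sample data
        System.out.println(institution(line, "Alice").orElse("(no match)"));
    }
}
```

Because the group is `(.*)`, it captures everything up to the end of the field; if Y should stop at a comma or period, the group would need to be narrowed accordingly.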
You can do it in one line:
while((line = br.readLine()) != null){
String y = line.replaceAll(".*?\t.*?\t[^\t]*educated at (\\w+).*|.*", "$1");
}
The variable y will be blank if there's no match. Note that (\\w+) captures only the first word of Y.
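To see how the one-line approach behaves, here is a small self-contained sketch with the character class written as `[^\t]*`. The class name `OneLineExtract`, the wrapper method `extract`, and the sample lines are assumptions made for illustration only.

```java
class OneLineExtract {
    // Hypothetical wrapper around the replaceAll one-liner.
    // First alternative: skip two tab-terminated fields, then capture the
    // word after "educated at". Second alternative (.*) consumes any line
    // that does not match, so group 1 is unset and "$1" expands to nothing.
    static String extract(String line) {
        return line.replaceAll(".*?\t.*?\t[^\t]*educated at (\\w+).*|.*", "$1");
    }

    public static void main(String[] args) {
        // Made-up sample data: three tab-separated fields per line.
        System.out.println(extract("1\tbio\tAlice is educated at Harvard University"));
        System.out.println(extract("2\tbio\tBob works at MIT").isEmpty());
    }
}
```

With these inputs the first call yields only "Harvard", because `\w+` stops at the first non-word character. If multi-word values are needed, the capturing-group approach with `Matcher` is probably the easier one to adapt.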