I have a paragraph of text. I want to extract the two or three sentences surrounding a keyword, using a regular expression in Java.
Example paragraph: ....My name is Tom. I live with my family in the countryside. I love the animal. So I have a dog and a cat. However, we eat a lot......
Keyword: a dog and a cat
Desired result: I love the animal. So I have a dog and a cat. However, we eat a lot
Note: I am using regular expressions in Java. This is what I have so far:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

String line = ".My name is Tom. I live with my family in the countryside. I love the animal. So I have a dog and a cat. However, we eat a lot...... ";
String pattern = "a dog and a cat";
Pattern r = Pattern.compile(pattern);
Matcher m = r.matcher(line);
if (m.find()) {
    // This matches only the keyword itself, not the surrounding sentences.
    System.out.println(m.toMatchResult());
    System.out.println(m.groupCount());
    System.out.println(m.group());
} else {
    System.out.println("False");
}
Here's the pattern you want:
\.([^.]+\.[^.]*a dog and a cat[^.]*\.[^.]+)
Since you're in Java, remember to double up the backslashes when encoding it as a string.
Basically, it matches a literal dot, then a run of characters that aren't dots (the first sentence), another literal dot, the middle sentence containing your keyword, a third literal dot, and finally another run of non-dot characters (the third sentence). The capture group starts after the leading dot, so group 1 holds just the three sentences.
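To show the pattern with the backslashes doubled, here is a minimal sketch. The class name `KeywordContextRegex` and the helper `extract` are made up for illustration, and wrapping the keyword in `Pattern.quote` is my own addition so that a keyword containing regex metacharacters is still treated literally.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of any library.
class KeywordContextRegex {

    // Returns the sentence containing the keyword plus one sentence on each
    // side, or null when the pattern does not match.
    static String extract(String text, String keyword) {
        // Same pattern as above, with backslashes doubled for a Java string.
        // Pattern.quote is an added safeguard (assumption, not in the original).
        String regex = "\\.([^.]+\\.[^.]*" + Pattern.quote(keyword) + "[^.]*\\.[^.]+)";
        Matcher m = Pattern.compile(regex).matcher(text);
        // Group 1 excludes the leading dot; trim the space that followed it.
        return m.find() ? m.group(1).trim() : null;
    }

    public static void main(String[] args) {
        String line = ".My name is Tom. I live with my family in the countryside. "
                + "I love the animal. So I have a dog and a cat. However, we eat a lot...... ";
        // prints: I love the animal. So I have a dog and a cat. However, we eat a lot
        System.out.println(extract(line, "a dog and a cat"));
    }
}
```

Keep in mind that the pattern needs a dot in front of the preceding sentence and treats every dot as a sentence end, so it will not match when the keyword sits too close to the start of the text, and abbreviations such as "Mr." will throw it off.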
I made this class for one of my projects. Hope it helps.
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class ExtractSentences {
    private String paragraph;
    private BreakIterator iterator;
    private List<String> sentences;

    public ExtractSentences(String paragraph) {
        this.paragraph = paragraph;
        this.extractSentences();
    }

    // Splits the paragraph into sentences using the locale-aware BreakIterator.
    public void extractSentences() {
        // Start from a fresh list so calling this again does not duplicate entries.
        sentences = new ArrayList<>();
        iterator = BreakIterator.getSentenceInstance(Locale.US);
        iterator.setText(paragraph);
        int lastIndex = iterator.first();
        while (lastIndex != BreakIterator.DONE) {
            int firstIndex = lastIndex;
            lastIndex = iterator.next();
            if (lastIndex != BreakIterator.DONE) {
                // Each sentence spans from the previous boundary to the next one.
                String sentence = paragraph.substring(firstIndex, lastIndex);
                sentences.add(sentence);
            }
        }
    }

    public String getParagraph() {
        return paragraph;
    }

    // Note: call extractSentences() afterwards to refresh the sentence list.
    public void setParagraph(String paragraph) {
        this.paragraph = paragraph;
    }

    public void setSentences(List<String> sentences) {
        this.sentences = sentences;
    }

    public List<String> getSentences() {
        return sentences;
    }
}
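The class above only splits the text; to get what the question asks for, you still have to find the sentence with the keyword and take its neighbours. A rough, self-contained sketch of that step follows. `KeywordContext` and `extract` are hypothetical names, and it uses `BreakIterator` directly rather than the class above so that it runs on its own.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper class, not part of any library.
class KeywordContext {

    // Returns the first sentence containing the keyword together with the
    // sentence before and after it (where they exist), or null if not found.
    static String extract(String paragraph, String keyword) {
        List<String> sentences = new ArrayList<>();
        BreakIterator it = BreakIterator.getSentenceInstance(Locale.US);
        it.setText(paragraph);
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String sentence = paragraph.substring(start, end).trim();
            if (!sentence.isEmpty()) {
                sentences.add(sentence);
            }
        }
        for (int i = 0; i < sentences.size(); i++) {
            if (sentences.get(i).contains(keyword)) {
                // Clamp the window so it works at the start or end of the text.
                int from = Math.max(0, i - 1);
                int to = Math.min(sentences.size(), i + 2);
                return String.join(" ", sentences.subList(from, to));
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String paragraph = "My name is Tom. I live with my family in the countryside. "
                + "I love the animal. So I have a dog and a cat. However, we eat a lot.";
        System.out.println(extract(paragraph, "a dog and a cat"));
    }
}
```

Unlike the regex, this still returns something when the keyword is in the first or last sentence, because the window is simply clamped to the available sentences.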