I want to initialize the Stanford CoreNLP pipeline once and reuse it many times without initializing it again, to improve the execution time.
Is it possible?
I have code:
public static boolean isHeaderMatched(String string) {
    // creates a StanfordCoreNLP object.
    Properties props = new Properties();
    props.put("annotators", "tokenize, ssplit, pos, lemma, ner");
    RedwoodConfiguration.current().clear().apply();
    StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
    Env env = TokenSequencePattern.getNewEnv();
    env.setDefaultStringMatchFlags(NodePattern.CASE_INSENSITIVE);
    env.setDefaultStringPatternFlags(Pattern.CASE_INSENSITIVE);
    Annotation document = new Annotation(string);
    // use the pipeline to annotate the document we created
    pipeline.annotate(document);
    List<CoreMap> sentences = document.get(SentencesAnnotation.class);
    CoreMapExpressionExtractor extractor = CoreMapExpressionExtractor.createExtractorFromFiles(env,
            "./app/utils/Summarizer/mapping/career_objective.rule",
            "./app/utils/Summarizer/mapping/personal_info.rule",
            "./app/utils/Summarizer/mapping/education.rule",
            "./app/utils/Summarizer/mapping/work_experience.rule",
            "./app/utils/Summarizer/mapping/certification.rule",
            "./app/utils/Summarizer/mapping/publication.rule",
            "./app/utils/Summarizer/mapping/award_achievement.rule",
            "./app/utils/Summarizer/mapping/hobbies_interest.rule",
            "./app/utils/Summarizer/mapping/lang_known.rule",
            "./app/utils/Summarizer/mapping/project_details.rule",
            "./app/utils/Summarizer/mapping/skill-set.rule",
            "./app/utils/Summarizer/mapping/misc_header.rule");
    boolean flag = false;
    for (CoreMap sentence : sentences) {
        List<MatchedExpression> matched = extractor.extractExpressions(sentence);
        //System.out.println("Probable Header is : " + matched);
        Set<String> uniqueMatchedKeyWordSet = DocumentParserUtil.removeDuplicate(matched);
        System.out.println("Matched: " + uniqueMatchedKeyWordSet + " and Size of MatchedSet: " + uniqueMatchedKeyWordSet.size());
        //checked if the more than half the no. of word in header(string) is matched
        if ((matched.size() >= uniqueMatchedKeyWordSet.size()) && !matched.isEmpty() && matched.size() >= Math.floorDiv(string.split("\\s").length, 2)) {
            //System.out.println("This is sure a header!");
            flag = true;
        } else {
            flag = false;
        }
        /*for(MatchedExpression phrase: matched){
            System.out.println("matched header type: " + phrase.getValue().get());
        }*/
    }
    return flag;
}
I want the following part of the code to run only on the first call of the above method, so that the model is loaded once:
// creates a StanfordCoreNLP object.
Properties props = new Properties();
props.put("annotators", "tokenize, ssplit, pos, lemma, ner");
RedwoodConfiguration.current().clear().apply();
StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
Env env = TokenSequencePattern.getNewEnv();
env.setDefaultStringMatchFlags(NodePattern.CASE_INSENSITIVE);
env.setDefaultStringPatternFlags(Pattern.CASE_INSENSITIVE);
Thanks in advance.
The following is an example of what you can do:
public class Example {
    private static StanfordCoreNLP pipeline;
    private static Env env;

    static {
        // creates a StanfordCoreNLP object.
        Properties props = new Properties();
        props.put("annotators", "tokenize, ssplit, pos, lemma, ner");
        RedwoodConfiguration.current().clear().apply();
        pipeline = new StanfordCoreNLP(props);
        env = TokenSequencePattern.getNewEnv();
        env.setDefaultStringMatchFlags(NodePattern.CASE_INSENSITIVE);
        env.setDefaultStringPatternFlags(Pattern.CASE_INSENSITIVE);
    }

    public static boolean isHeaderMatched(String string) {
        Annotation document = new Annotation(string);
        // use the pipeline to annotate the document we created
        pipeline.annotate(document);
        List<CoreMap> sentences = document.get(SentencesAnnotation.class);
        CoreMapExpressionExtractor extractor = CoreMapExpressionExtractor.createExtractorFromFiles(env,
                "./app/utils/Summarizer/mapping/career_objective.rule",
                "./app/utils/Summarizer/mapping/personal_info.rule",
                "./app/utils/Summarizer/mapping/education.rule",
                "./app/utils/Summarizer/mapping/work_experience.rule",
                "./app/utils/Summarizer/mapping/certification.rule",
                "./app/utils/Summarizer/mapping/publication.rule",
                "./app/utils/Summarizer/mapping/award_achievement.rule",
                "./app/utils/Summarizer/mapping/hobbies_interest.rule",
                "./app/utils/Summarizer/mapping/lang_known.rule",
                "./app/utils/Summarizer/mapping/project_details.rule",
                "./app/utils/Summarizer/mapping/skill-set.rule",
                "./app/utils/Summarizer/mapping/misc_header.rule");
        boolean flag = false;
        for (CoreMap sentence : sentences) {
            List<MatchedExpression> matched = extractor.extractExpressions(sentence);
            //System.out.println("Probable Header is : " + matched);
            Set<String> uniqueMatchedKeyWordSet = DocumentParserUtil.removeDuplicate(matched);
            System.out.println("Matched: " + uniqueMatchedKeyWordSet + " and Size of MatchedSet: " + uniqueMatchedKeyWordSet.size());
            // checked if the more than half the no. of word in header(string) is matched
            if ((matched.size() >= uniqueMatchedKeyWordSet.size()) && !matched.isEmpty() && matched.size() >= Math.floorDiv(string.split("\\s").length, 2)) {
                flag = true;
            } else {
                flag = false;
            }
        }
        return flag;
    }
}
In the above code the static block will be executed once, when the class is initialized (typically the first time the class is used). If you do not wish for this behavior then allow access to an init method, like the following:
public class Example {
    private static StanfordCoreNLP pipeline;
    private static Env env;

    // Must be called once before isHeaderMatched is used.
    public static void init() {
        // creates a StanfordCoreNLP object.
        Properties props = new Properties();
        props.put("annotators", "tokenize, ssplit, pos, lemma, ner");
        RedwoodConfiguration.current().clear().apply();
        pipeline = new StanfordCoreNLP(props);
        env = TokenSequencePattern.getNewEnv();
        env.setDefaultStringMatchFlags(NodePattern.CASE_INSENSITIVE);
        env.setDefaultStringPatternFlags(Pattern.CASE_INSENSITIVE);
    }

    public static boolean isHeaderMatched(String string) {
        // code left out for brevity
    }
}
Which can be called from another class using:
Example.init();
Example.isHeaderMatched("foobar");
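If you would rather not depend on callers remembering to call init (or calling it only once), one common alternative is the lazy holder idiom. The sketch below is only an illustration: ExpensiveResource is a hypothetical stand-in for StanfordCoreNLP so that the example runs without the library, and LazyExample and process are made-up names. The nested Holder class is initialized by the JVM the first time Holder.INSTANCE is read, and class initialization is thread-safe, so the resource is built exactly once.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for an expensive object such as StanfordCoreNLP.
final class ExpensiveResource {
    // Counts how many times the constructor has run (for demonstration only).
    static final AtomicInteger CONSTRUCTIONS = new AtomicInteger();

    ExpensiveResource() {
        CONSTRUCTIONS.incrementAndGet();
    }

    // Placeholder for real work such as pipeline.annotate(document).
    int length(String text) {
        return text.length();
    }
}

final class LazyExample {
    private LazyExample() {}

    // Holder is not initialized until Holder.INSTANCE is first accessed,
    // so the resource is created on the first call to process, not earlier.
    private static final class Holder {
        static final ExpensiveResource INSTANCE = new ExpensiveResource();
    }

    static int process(String text) {
        return Holder.INSTANCE.length(text);
    }
}
```

With this layout the first call to LazyExample.process pays the construction cost and every later call reuses the same instance. In your case the pipeline and env fields would live in the holder class instead of ExpensiveResource.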
While writing this answer I noticed a possible flaw in your logic. The following code may not produce the behavior you desire.
boolean flag = false;
for (CoreMap sentence : sentences) {
    List<MatchedExpression> matched = extractor.extractExpressions(sentence);
    //System.out.println("Probable Header is : " + matched);
    Set<String> uniqueMatchedKeyWordSet = DocumentParserUtil.removeDuplicate(matched);
    System.out.println("Matched: " + uniqueMatchedKeyWordSet + " and Size of MatchedSet: " + uniqueMatchedKeyWordSet.size());
    // checked if the more than half the no. of word in header(string) is matched
    if ((matched.size() >= uniqueMatchedKeyWordSet.size()) && !matched.isEmpty() && matched.size() >= Math.floorDiv(string.split("\\s").length, 2)) {
        flag = true;
    } else {
        flag = false;
    }
}
You're iterating over every CoreMap in the List<CoreMap> collection sentences. On every iteration you set flag to the result of the conditional, and this is where the problem lies. The boolean flag will only reflect the result of the last sentence run through the conditional. If you need to know the result for each sentence, then you should keep a list of booleans to track the results; otherwise remove the loop and just check the last sentence (because that's what your loop is effectively doing anyway).
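To make the difference concrete, here is a small self-contained sketch of the three behaviors: one result per item, "true if any item matches", and "result of the last item only" (what the loop above effectively computes). It is an illustration under assumptions: plain strings stand in for CoreMap sentences, the Predicate stands in for your matching condition, and FlagAggregation with its method names is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

final class FlagAggregation {
    private FlagAggregation() {}

    // Keeps one boolean per item, in the same order as the input list.
    static <T> List<Boolean> perItem(List<T> items, Predicate<T> condition) {
        List<Boolean> results = new ArrayList<>();
        for (T item : items) {
            results.add(condition.test(item));
        }
        return results;
    }

    // True as soon as any item satisfies the condition.
    static <T> boolean any(List<T> items, Predicate<T> condition) {
        for (T item : items) {
            if (condition.test(item)) {
                return true;
            }
        }
        return false;
    }

    // Equivalent to overwriting a flag on every iteration:
    // only the last item decides the outcome (false for an empty list).
    static <T> boolean lastOnly(List<T> items, Predicate<T> condition) {
        boolean flag = false;
        for (T item : items) {
            flag = condition.test(item);
        }
        return flag;
    }
}
```

Which of the three you want depends on what isHeaderMatched is supposed to mean when the input contains more than one sentence.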