How to remove all stopwords from each line of a string using the weka.core.Stopwords Java class
I have a string in the following format:
String wordTyp = "i love to bake you a good sandwitch \\n" + "and i love biscuit and you? \\n";
How would I remove every stop word from each line of the string using weka.core.Stopwords in Java?
public String removeStopWords(String word,int OriginCount){
    Scanner scanner = new Scanner(word);
    StringBuilder wordDocNoStopWord = new StringBuilder();
    String lineOfText ="";
    int lineCount = 0;
    Stopwords checker = new Stopwords();
    while (scanner.hasNextLine() && lineCount < OriginCount){
        lineOfText = scanner.nextLine() + " \\n";
        if(checker.is(lineOfText)){ /// confirms a stopword in here
            checker.clear(); ///and clears any stopwords in that line
        }
        lineCount++;
        wordDocNoStopWord.append(new StringBuilder(lineOfText));
        System.out.printf(lineOfText);
    }
    scanner.close();
    return wordDocNoStopWord.toString();
}
You could do something like this (I don't have access to a compiler, so it may require minor fixes):
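import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import weka.core.Stopwords;

public String removeStopWords(String word, int OriginCount) {
    String delim = " ";
    // Split the input into individual words.
    List<String> list = new ArrayList<String>(Arrays.asList(word.split(delim)));
    Stopwords checker = new Stopwords();
    for (int i = 0; i < list.size(); i++) {
        String temp = list.get(i);
        // is() checks a single word against the stopword list.
        if (checker.is(temp)) {
            list.remove(i);
            i--; // stay on the same index after a removal
        }
    }
    // Rejoin the remaining words, separated by single spaces.
    StringBuilder listAsString = new StringBuilder();
    for (String temp : list) {
        listAsString.append(temp).append(" ");
    }
    return listAsString.toString();
}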
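Since the question asks about removing stopwords from each line, here is a minimal sketch of a line-by-line variant that keeps the line breaks intact. It is not part of the original answer. The class name StopwordLineFilter and the small DEMO_STOPWORDS set are hypothetical stand-ins so the example runs without Weka on the classpath; with a Weka version that still ships weka.core.Stopwords, passing a predicate built on its is(String) method should work in the same place, though that is an assumption I have not compiled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

class StopwordLineFilter {

    // Hypothetical stand-in list (NOT Weka's list), used only to keep the sketch runnable.
    static final Set<String> DEMO_STOPWORDS =
            new HashSet<>(Arrays.asList("i", "to", "you", "a", "and"));

    // Removes every token for which isStopword returns true, one line at a time.
    static String removeStopWords(String text, Predicate<String> isStopword) {
        List<String> cleanedLines = new ArrayList<>();
        for (String line : text.split("\n")) {
            List<String> kept = new ArrayList<>();
            // Split on runs of whitespace and test each word on its own.
            for (String token : line.trim().split("\\s+")) {
                if (!token.isEmpty() && !isStopword.test(token.toLowerCase())) {
                    kept.add(token);
                }
            }
            cleanedLines.add(String.join(" ", kept));
        }
        // Put the lines back together with the original line structure.
        return String.join("\n", cleanedLines);
    }

    public static void main(String[] args) {
        String wordTyp = "i love to bake you a good sandwitch \n"
                + "and i love biscuit and you? \n";
        System.out.println(removeStopWords(wordTyp, DEMO_STOPWORDS::contains));
    }
}
```

Note that the token "you?" survives because the question mark is still attached when the word is tested. Stripping punctuation before the check is a design choice this sketch leaves out.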