
How to remove all stopwords from each line of a string using the weka.core.Stopwords Java class

I have a string in this format:

 String wordTyp = "i love to bake you a good sandwitch \\n" + "and i love biscuit and you? \\n"; 

How can I remove every stopword from each line of the string using weka.core.Stopwords in Java? This is what I have tried:

 public String removeStopWords(String word, int OriginCount) {
     Scanner scanner = new Scanner(word);
     StringBuilder wordDocNoStopWord = new StringBuilder();
     String lineOfText = "";
     int lineCount = 0;
     Stopwords checker = new Stopwords();
     while (scanner.hasNextLine() && lineCount < OriginCount) {
         lineOfText = scanner.nextLine() + " \\n";
         if (checker.is(lineOfText)) { /// confirms a stopword in here
             checker.clear(); /// and clears any stopwords in that line
         }
         lineCount++;
         wordDocNoStopWord.append(new StringBuilder(lineOfText));
         System.out.printf(lineOfText);
     }
     scanner.close();
     return wordDocNoStopWord.toString();
 }

You can do it like this (I don't have access to a compiler, so it may need small fixes):

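import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import weka.core.Stopwords;

public String removeStopWords(String word, int OriginCount) {
    // Split on runs of whitespace so no token carries a stray newline.
    String delim = "\\s+";
    List<String> list = new ArrayList<String>(Arrays.asList(word.trim().split(delim)));

    Stopwords checker = new Stopwords();

    for (int i = 0; i < list.size(); i++) {
        String temp = list.get(i);

        // Test one word at a time, not the whole line.
        if (checker.is(temp)) {
            list.remove(i);
            i--; // stay on the same index after a removal
        }
    }

    // Rejoin the surviving words with single spaces.
    // Note: OriginCount is unused and line breaks are not preserved here.
    StringBuilder listAsString = new StringBuilder();
    for (String temp : list) {
        listAsString.append(temp).append(" ");
    }
    return listAsString.toString();
}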
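The question asks for stopwords to be removed from each line, whereas the method above flattens the whole string into one sequence of words. The attempt in the question has the opposite problem: it passes a whole line to checker.is(), which expects a single word, and checker.clear() empties the checker's own stopword list instead of editing the line. A minimal line-preserving sketch follows. To keep it runnable without Weka on the classpath, it uses a small hypothetical stopword set (STOPWORDS) and a hypothetical helper (isStopword); neither is Weka's actual list or API. With a Weka version that ships weka.core.Stopwords, the helper body could presumably delegate to the checker's is(String) method instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class StopwordLineFilter {
    // Hypothetical stand-in for the Weka stopword list (assumption, for illustration only).
    private static final Set<String> STOPWORDS =
            new HashSet<>(Arrays.asList("i", "to", "you", "a", "and"));

    // Hypothetical helper; swap its body for a Weka lookup if Weka is available.
    public static boolean isStopword(String token) {
        return STOPWORDS.contains(token.toLowerCase());
    }

    // Removes stopwords line by line and keeps one output line per input line.
    public static String removeStopWords(String text) {
        StringBuilder out = new StringBuilder();
        for (String line : text.split("\\R")) {           // \R matches any line break (Java 8+)
            List<String> kept = new ArrayList<>();
            for (String token : line.trim().split("\\s+")) {
                // Tokens with attached punctuation (e.g. "you?") are compared as-is.
                if (!token.isEmpty() && !isStopword(token)) {
                    kept.add(token);
                }
            }
            out.append(String.join(" ", kept)).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String wordTyp = "i love to bake you a good sandwitch \n"
                + "and i love biscuit and you? \n";
        System.out.print(removeStopWords(wordTyp));
    }
}
```

Because punctuation stays attached to its word, "you?" survives the filter in this sketch; stripping punctuation before the lookup is a separate design choice.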
