Removing stop words from a String

class MyClass {
    public static void remove_stopwords(String[] query, String[] stopwords) {
        A: for (int i = 0; i < query.length; i++) {
            B: for (int j = 0; j < stopwords.length; j++) {
                C: if (query[i].equals(stopwords[j])) {
                    break B;
                } else {
                    System.out.println(query[i]);
                    break B;
                }
            }
        }
    }
}

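For some reason, this code only works correctly about halfway through the problem. It takes the first stop word out of the query, but it ignores the rest. Any help would be appreciated.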

class MyClass {
    public static void remove_stopwords(String[] query, String[] stopwords) {
        A: for (int i = 0; i < query.length; i++) {
            // compare query[i] against every stop word
            B: for (int j = 0; j < stopwords.length; j++) {
                // stop word found: skip this query word
                C: if (query[i].equals(stopwords[j])) {
                    break B;
                } else {
                    // reaching the last stop word without a match means
                    // query[i] equals none of the stop words, so print it
                    if (j == stopwords.length - 1) {
                        System.out.println(query[i]);
                    }
                }
            }
        }
    }
}
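
The same check can also be written without labeled breaks. As a rough sketch only (the class name `StopWordFilter` and the method `removeStopWords` are made up for illustration, and it returns the kept words instead of printing them), the stop words can go into a `HashSet` so that each query word needs a single lookup:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class StopWordFilter {
    // Hypothetical helper: returns the query words that are not stop words,
    // keeping their original order. Matching is case-sensitive, like equals().
    static List<String> removeStopWords(String[] query, String[] stopwords) {
        Set<String> stopSet = new HashSet<>(Arrays.asList(stopwords));
        List<String> kept = new ArrayList<>();
        for (String word : query) {
            if (!stopSet.contains(word)) {
                kept.add(word);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        String[] query = {"the", "quick", "fox", "at", "night"};
        String[] stopwords = {"the", "at"};
        System.out.println(removeStopWords(query, stopwords)); // prints [quick, fox, night]
    }
}
```

Unlike the nested-loop version, this also keeps every query word when the stop word array is empty, since nothing depends on reaching a last stop word.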

I tried adding the stop words to an ArrayList and comparing them against a String array, removing any stop word that is found. But I ran into a problem with the loop.

public static void main(String[] args) {
    ArrayList<String> stopWords = new ArrayList<String>();
    stopWords.add("that");
    stopWords.add("at");
    String sentence = "I am not that good at coder";
    String[] SentSplit = sentence.split(" ");
    System.out.println(SentSplit.length);
    StringBuffer finalSentence = new StringBuffer();
    boolean b = false;

    for(int i=0; i<stopWords.size();i++){
        String stopWord = stopWords.get(i);
        for(int j = 0; j<SentSplit.length;j++){
            String word = SentSplit[j];
            if(!stopWord.equalsIgnoreCase(word)){
                finalSentence.append(SentSplit[j] + " ");
            }
        }
    }
    System.out.println(finalSentence);
}

Expected result: I am not good coder

But my result is: I am not good at coder I am not that good coder
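
The duplicated output seems to come from the loop order: the outer loop runs once per stop word, so the whole sentence, minus only that one stop word, gets appended on every pass. One possible fix is to loop over the words once and test each word against all stop words before appending it. This is a sketch under that assumption; the class name `SentenceStopWordRemover` and the helper `removeStopWords` are hypothetical names, not part of the original post:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SentenceStopWordRemover {
    // Hypothetical helper: drops every word that matches any stop word,
    // ignoring case, and joins the remaining words with single spaces.
    static String removeStopWords(String sentence, List<String> stopWords) {
        List<String> kept = new ArrayList<>();
        for (String word : sentence.split(" ")) {
            boolean isStopWord = false;
            for (String stopWord : stopWords) {
                if (stopWord.equalsIgnoreCase(word)) {
                    isStopWord = true;
                    break; // no need to check the remaining stop words
                }
            }
            if (!isStopWord) {
                kept.add(word);
            }
        }
        return String.join(" ", kept);
    }

    public static void main(String[] args) {
        List<String> stopWords = Arrays.asList("that", "at");
        System.out.println(removeStopWords("I am not that good at coder", stopWords));
        // prints: I am not good coder
    }
}
```

Collecting the kept words in a list and joining them at the end also avoids the trailing space that `append(word + " ")` leaves behind.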


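Note: the technical posts on this site are licensed under CC BY-SA 4.0. If you repost them, please cite this site's URL or the original address. For any questions, contact yoyou2525@163.com.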

 