Most efficient way to check a file for a list of words

Question

I just finished an assignment that asked me to add all the Java keywords to a HashSet, then read in a .java file and count how many times any keyword appears in that file.

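The approach I took was: create a String[] array containing all the keywords; create a HashSet and use Collections.addAll to add the array to it; then, while iterating over the text file, check each word with HashSet.contains(currentWordFromFile).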
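The approach described above can be sketched roughly as follows. This is not the code from the pastebin link: the class name KeywordSetSketch, the abbreviated keyword list, and the regex used to split text into words are all assumptions made for illustration.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

class KeywordSetSketch {

    // Abbreviated list for illustration only (assumption); the real list holds every Java keyword.
    static final String[] KEYWORDS = { "class", "if", "int", "return", "static", "void" };

    // Counts how many tokens in the text are members of the keyword set.
    static int countKeywords(String text) {
        Set<String> keywords = new HashSet<String>();
        Collections.addAll(keywords, KEYWORDS);

        int count = 0;
        // Split on runs of non-word characters so "f()" yields the token "f".
        for (String token : text.split("\\W+")) {
            if (keywords.contains(token)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // "static", "int" and "return" are in the set, so this prints 3.
        System.out.println(countKeywords("static int f() { return 1; }"));
    }
}
```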

Someone suggested using a Hashtable for this. Then I saw a similar example that used a TreeSet. I'm just curious: what is the recommended way to do this?

(Full code here: http://pastebin.com/GdDmCWj0)

Answer 1

Try using a Map<String, Integer>, where the String is the word and the Integer is the number of times that word occurs.

One benefit of this is that you don't need to process the file twice.
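A minimal sketch of that idea, assuming Java 8 or later for Map.merge. The class name KeywordTallySketch, the tally method, and the sample lines are hypothetical; a TreeMap is used here only so the printed order is predictable.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

class KeywordTallySketch {

    // Tallies each keyword in a single pass over the lines.
    static Map<String, Integer> tally(List<String> lines, Set<String> keywords) {
        Map<String, Integer> counts = new TreeMap<String, Integer>(); // sorted keys for stable printing
        for (String line : lines) {
            for (String token : line.split("\\W+")) {
                if (keywords.contains(token)) {
                    // Insert 1 on first sight, otherwise add 1 to the existing count.
                    counts.merge(token, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Set<String> keywords = new HashSet<String>(Arrays.asList("if", "int", "return"));
        List<String> lines = Arrays.asList("int a = 1;", "if (a > 0) return;", "int b;");
        System.out.println(tally(lines, keywords)); // prints {if=1, int=2, return=1}
    }
}
```

Because the per-keyword counts and the total both come out of the same map, the input only has to be read once.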

Answer 2

You said you "had homework", so I assume you have already finished it.

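I would do a few things differently. First, I think some of the keywords in your String array are incorrect. According to Wikipedia and Oracle, Java has 50 keywords. Anyway, I have commented my code thoroughly. Here is what I came up with...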

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class CountKeywords {

    public static void main(String[] args) {

        // the 50 reserved keywords plus the literals true, false and null
        String[] theKeywords = { "abstract", "assert", "boolean", "break", "byte", "case", "catch", "char", "class", "const", "continue", "default", "do", "double", "else", "enum", "extends", "false", "final", "finally", "float", "for", "goto", "if", "implements", "import", "instanceof", "int", "interface", "long", "native", "new", "null", "package", "private", "protected", "public", "return", "short", "static", "strictfp", "super", "switch", "synchronized", "this", "throw", "throws", "transient", "true", "try", "void", "volatile", "while" };

        // put each keyword in the map with value 0
        Map<String, Integer> theKeywordCount = new HashMap<String, Integer>();
        for (String str : theKeywords) {
            theKeywordCount.put(str, 0);
        }

        // attempt to open and read the file named by the first argument;
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader br = new BufferedReader(new FileReader(args[0]))) {

            String sLine;

            // read lines until reaching the end of the file
            while ((sLine = br.readLine()) != null) {

                // extract the words from the current line by splitting on
                // non-word characters (this also counts keywords that appear
                // inside comments and string literals)
                for (String word : sLine.split("\\W+")) {

                    // if the word is a keyword, increment its count
                    if (theKeywordCount.containsKey(word)) {
                        theKeywordCount.put(word, theKeywordCount.get(word) + 1);
                    }
                }
            }

        } catch (IOException exception) {
            // Unable to find or read the file
            // (FileNotFoundException is a subclass of IOException).
            exception.printStackTrace();
        }

        // add up how many times each keyword was encountered
        int occurrences = 0;
        for (Integer i : theKeywordCount.values()) {
            occurrences += i;
        }

        System.out.println("\n\nTotal occurrences in file: " + occurrences);
    }
}

Each time I encounter a word in the file, I first check whether it is in the Map; if it is not, it is not a valid keyword; if it is, I update the value associated with the keyword, i.e. I increment the associated Integer by 1, because we have seen this keyword again.

Alternatively, you could get rid of the last for loop and just keep a running count, in which case you would have...

if (theKeywordCount.containsKey(word)) {
    occurrences++;
}

...and then print the counter at the end.

I don't know whether this is the most efficient way, but I think it is a solid start.
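On the HashSet versus TreeSet question, a rough sketch with a made-up three-word list and a hypothetical class name SetChoiceSketch: both sets give the same membership answers. HashSet lookups take expected constant time, while TreeSet lookups take logarithmic time but keep the elements sorted. Hashtable is a legacy synchronized map and offers no advantage for a single-threaded count like this one.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class SetChoiceSketch {

    // Counts tokens that are members of the given set, whatever its implementation.
    static int countMatches(Set<String> keywords, List<String> tokens) {
        int count = 0;
        for (String token : tokens) {
            if (keywords.contains(token)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Illustrative sample only (assumption), not the full keyword list.
        List<String> sample = Arrays.asList("while", "if", "for");
        Set<String> hashed = new HashSet<String>(sample); // expected O(1) contains, no ordering
        Set<String> sorted = new TreeSet<String>(sample); // O(log n) contains, sorted iteration

        List<String> tokens = Arrays.asList("if", "x", "for", "if");
        System.out.println(countMatches(hashed, tokens)); // prints 3
        System.out.println(countMatches(sorted, tokens)); // prints 3
        System.out.println(sorted);                       // prints [for, if, while]
    }
}
```

For a fixed list of around fifty keywords the difference is unlikely to be measurable, so the choice mostly comes down to whether sorted output is wanted.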

If you have any questions, let me know. I hope this helps.
斯托伊奇

Answer 1
2, accepted, 2011-04-27 05:22:15

Answer 2
1, 2011-04-27 06:12:47
