Store occurrences of words in a file and their count, using Scanner (Java)

Question

Here's the code:

        Scanner scan = new Scanner(new FileReader ("C:\\mytext.txt"));
        HashMap<String, Integer> listOfWords = new HashMap<String, Integer>();

        while(scan.hasNextLine())
        {
            Scanner innerScan = new Scanner(scan.nextLine());
            boolean wordExistence ;
            while(wordExistence = innerScan.hasNext())
            {
                String word = innerScan.next(); 
                int countWord = 0;
                if(!listOfWords.containsKey(word)){ // word is not in the map yet
                    listOfWords.put(word, 1); 
                }else{
                    countWord = listOfWords.get(word) + 1; 
                    listOfWords.remove(word);
                    listOfWords.put(word, countWord); 
                }
            }
        }

        System.out.println(listOfWords.toString());

The problem is, my output contains words like :

document.Because=1 document.This=1 space.=1

How do I handle the full stops that are occurring? (For that matter, I think any sentence terminator would be an issue, such as a question mark or an exclamation mark.)

Answer 1

Look at the class description in the Scanner API, in particular the paragraph about using delimiters other than whitespace.

Answer 2

Scanner uses whitespace as its default delimiter. You can call useDelimiter() on the Scanner instance and specify your own regular expression to be used as the delimiter.
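
As a minimal sketch of that idea, assuming the only sentence terminators of interest are `.`, `?` and `!` (the class and method names here are made up for illustration), the delimiter can be widened to a character class that covers whitespace and those three characters:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class DelimiterDemo {
    // Splits a line on runs of whitespace and/or '.', '?', '!'.
    // The set of terminators is an assumption; extend the character class as needed.
    static List<String> tokens(String line) {
        List<String> result = new ArrayList<>();
        try (Scanner scanner = new Scanner(line)) {
            scanner.useDelimiter("[\\s.?!]+");
            while (scanner.hasNext()) {
                result.add(scanner.next());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(tokens("This is a document.Because of space."));
    }
}
```

With this delimiter, a token such as `document.Because` from the question comes back as the two words `document` and `Because`.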

Answer 3

If you want your input to be split not only on whitespace but also on full stops and question/exclamation marks, you will have to define a Pattern and then apply it to your Scanner using useDelimiter.
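
One possible shape for this, as a sketch rather than a definitive implementation: the delimiter is compiled once as a Pattern and applied to a single Scanner over the whole input. Because the pattern also matches line breaks, no inner per-line Scanner is needed. The class name, the method name and the choice of terminators are assumptions, and a StringReader stands in for the FileReader from the question so the example runs on its own.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import java.util.Scanner;
import java.util.regex.Pattern;

class WordCounter {
    // Assumption: a word ends at whitespace or at '.', '?' or '!'.
    private static final Pattern DELIMITER = Pattern.compile("[\\s.?!]+");

    static Map<String, Integer> countWords(Readable source) {
        Map<String, Integer> counts = new HashMap<>();
        try (Scanner scanner = new Scanner(source)) {
            scanner.useDelimiter(DELIMITER);
            while (scanner.hasNext()) {
                String word = scanner.next();
                Integer previous = counts.get(word);
                // put() replaces any existing entry, so one call covers both cases
                counts.put(word, previous == null ? 1 : previous + 1);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Stand-in for: new FileReader("C:\\mytext.txt")
        StringReader input = new StringReader("This is a document.Because this document has space.");
        System.out.println(countWords(input));
    }
}
```

Note that the counts stay case-sensitive, so `This` and `this` are separate keys, just as in the original code.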

Answer 4

You may want to tinker with the following for speed: it matches words with a precompiled Pattern and drops the redundant remove() call.

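    final Pattern WORD = Pattern.compile("\\w+");
    while(scan.hasNextLine())
    {
        Scanner innerScan = new Scanner(scan.nextLine());
        String word;
        // hasNext(WORD) would stop at the first token containing punctuation
        // (e.g. "space."), so pull out each \w+ match with findInLine instead;
        // it returns null once the line has no more words.
        while((word = innerScan.findInLine(WORD)) != null)
        {
            if(!listOfWords.containsKey(word)){
                listOfWords.put(word, 1); 
            }else{
                int countWord = listOfWords.get(word) + 1; 
                // put() overwrites the old count, so no remove() is needed
                listOfWords.put(word, countWord); 
            }
        }
    }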
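
The containsKey/get/put sequence above could also be collapsed into a single call. The sketch below assumes Java 8 or later, where `Map.merge` is available; the class and method names are hypothetical, and the words are passed in as an `Iterable` so the counting step can be shown independently of how the input is tokenized.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

class MergeCount {
    static Map<String, Integer> count(Iterable<String> words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : words) {
            // Stores 1 for a new word, otherwise replaces the old value with old + 1.
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(count(Arrays.asList("space", "document", "space")));
    }
}
```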

4 answers

solution1
2 ACCEPTED 2012-03-14 17:33:11

solution2
2 2012-03-14 17:35:21

solution3
1 2012-03-14 17:36:22

solution4
1 2012-03-14 17:49:55
