Word Count Program using HashMaps

import java.io.*;
import java.util.*;

public class ListSetMap2 
{
    public static void main(String[] args)
    {
        Map<String, Integer> my_collection = new HashMap<String, Integer>();
        Scanner keyboard = new Scanner(System.in);

        System.out.println("Enter a file name");
        String filenameString = keyboard.nextLine();
        File filename = new File(filenameString);
        int word_position = 1;
        int word_num = 1;

        try
        {
            Scanner data_store = new Scanner(filename);
            System.out.println("Opening " + filenameString);
            while(data_store.hasNext())
            {
                String word = data_store.next();
                if(word.length() > 5)
                {
                    if(my_collection.containsKey(word))
                    {
                        my_collection.get(my_collection.containsKey(word));
                        Integer p = (Integer) my_collection.get(word_num++);
                        my_collection.put(word, p);
                    }
                    else
                    {
                        Integer i = (Integer) my_collection.get(word_num);
                        my_collection.put(word, i);
                    }
                }
            }
        }
        catch (FileNotFoundException e)
        {
            System.out.println("Nope!");
        }
    }
}

I'm trying to write a program that scans a file, stores the words in a HashMap, and counts the number of times each word occurs in the document, counting only words longer than 5 characters.

It's a bit of a mess in the middle, but I'm having trouble counting how many times each word occurs and keeping an individual count for each word. I'm sure there is a simple solution here and I'm just missing it. Please help!

Your logic for setting the frequency of a word is wrong. Here is a simple approach that should work for you:

    // if the word is already present in the hashmap
    if (my_collection.containsKey(word)) {
        // just increment the current frequency of the word
        // this overrides the existing frequency
        my_collection.put(word, my_collection.get(word) + 1);
    } else {
        // since the word is not there just put it with a frequency 1
        my_collection.put(word, 1);
    }
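For context, here is a minimal sketch of how that branch might sit inside a complete counting routine. The class name `WordCountSketch` and the helper `countLongWords` are hypothetical, and the sketch scans a `String` instead of a file so that it runs without any input file; it is an illustration under those assumptions, not the asker's final program.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Scanner;

class WordCountSketch {
    // Hypothetical helper: counts whitespace-separated words longer than 5 characters.
    static Map<String, Integer> countLongWords(String text) {
        Map<String, Integer> frequencies = new HashMap<>();
        try (Scanner scanner = new Scanner(text)) {
            while (scanner.hasNext()) {
                String word = scanner.next();
                if (word.length() > 5) {
                    if (frequencies.containsKey(word)) {
                        // seen before: read the old count, add one, store it back
                        frequencies.put(word, frequencies.get(word) + 1);
                    } else {
                        // first occurrence: start the count at 1
                        frequencies.put(word, 1);
                    }
                }
            }
        }
        return frequencies;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts =
                countLongWords("banana apple banana cherry kiwi banana cherry");
        System.out.println("banana: " + counts.get("banana"));
        System.out.println("cherry: " + counts.get("cherry"));
        // "apple" has exactly 5 characters, so it is never stored
        System.out.println("contains apple: " + counts.containsKey("apple"));
    }
}
```

To read from a file as in the question, the `Scanner` could be constructed from a `File` instead, with the `FileNotFoundException` handled as the original code already does.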

(Only giving hints, since this seems to be homework.) my_collection is (correctly) a HashMap that maps String keys to Integer values; in your situation, a key is supposed to be a word, and the corresponding value is supposed to be the number of times you have seen that word (its frequency). Each time you call my_collection.get(x), the parameter x needs to be a String, namely the word whose frequency you want to know (unfortunately, HashMap doesn't enforce this at compile time, because get accepts any Object). Each time you call my_collection.put(x, y), x needs to be a String, and y needs to be an Integer or int, namely the frequency for that word.
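To illustrate the point about `get` not enforcing the key type, here is a small sketch. The class name `GetKeyTypeSketch` and its two helper methods are hypothetical and exist only for this illustration. It shows that a lookup with a number compiles but finds nothing, which is likely why the code in the question ends up storing `null` values.

```java
import java.util.HashMap;
import java.util.Map;

class GetKeyTypeSketch {
    // Compiles because Map.get takes an Object; the int is boxed to an Integer,
    // which can never equal a String key, so the result is always null here.
    static Integer lookUpByNumber(Map<String, Integer> frequencies, int number) {
        return frequencies.get(number);
    }

    // The intended lookup: the key is the word itself.
    static Integer lookUpByWord(Map<String, Integer> frequencies, String word) {
        return frequencies.get(word);
    }

    public static void main(String[] args) {
        Map<String, Integer> frequencies = new HashMap<>();
        frequencies.put("banana", 2);

        System.out.println(lookUpByNumber(frequencies, 1));      // no such key
        System.out.println(lookUpByWord(frequencies, "banana")); // the stored count
    }
}
```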

Given this, give some more thought to what you're using as parameters, the sequence in which you need to make the calls, and how you need to manipulate the values. For example, if you've already determined that my_collection doesn't contain the word, does it make sense to ask my_collection for the word's frequency? If it does contain the word, how do you need to change the frequency before putting the new value into my_collection?

(Also, please choose a more descriptive name for my_collection, e.g. frequencies.)

Try it this way:

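while (data_store.hasNext()) {
    String word = data_store.next();
    if (word.length() > 5) {
        if (my_collection.get(word) == null) {
            // first occurrence: put needs both the key and the value
            my_collection.put(word, 1);
        } else {
            // seen before: store the old count plus one under the same key
            my_collection.put(word, my_collection.get(word) + 1);
        }
    }
}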
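As a possible alternative, on Java 8 or later the null check and the two `put` calls can be collapsed into a single `Map.merge` call. The sketch below assumes Java 8+; the class name `MergeCountSketch` and the helper `countLongWords` are hypothetical, and the words are passed in as an array purely to keep the example self-contained.

```java
import java.util.HashMap;
import java.util.Map;

class MergeCountSketch {
    // Hypothetical helper: counts words longer than 5 characters using Map.merge.
    static Map<String, Integer> countLongWords(String[] words) {
        Map<String, Integer> frequencies = new HashMap<>();
        for (String word : words) {
            if (word.length() > 5) {
                // absent key: stores 1; present key: stores Integer.sum(oldCount, 1)
                frequencies.merge(word, 1, Integer::sum);
            }
        }
        return frequencies;
    }

    public static void main(String[] args) {
        String[] words = {"planet", "planet", "moon", "galaxy"};
        Map<String, Integer> counts = countLongWords(words);
        System.out.println("planet: " + counts.get("planet"));
        System.out.println("galaxy: " + counts.get("galaxy"));
    }
}
```

Whether this is preferable to the explicit if/else is a matter of taste; the explicit version makes the two cases easier to see while learning.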
