Java - Putting words from .txt file into HashMap?

As the title states, I am attempting to read a simple text file and put the individual words into a hash map. I will eventually extend my program to count the frequency of each word using a HashMap. I have the following text file (text.txt):

it was the best of times 
it was the worst of times

it was the age of wisdom 
it was the age of foolishness

it was the epoch of belief 
it was the epoch of incredulity

it was the season of light 
it was the season of darkness

it was the spring of hope 
it was the winter of despair
see the test
try this one

I have written the following code:

import java.util.*; 
import java.io.*; 

public class Profile{

  public static String file;
  public static int len;
  public static int count = 0;
  public static String[] words;
  public static String[] unrepeatedWords;

  public static Map<String, Integer> record = new HashMap<String, Integer>();
  //Integer count = record.get(word);
  //Integer count = record.get(word);
  Set<String> keySet = record.keySet(); 



//Method to read whole file
  static void wholeFile(File file){
    try {
            Scanner in = new Scanner(file);
            int lineNumber = 1;

            while(in.hasNextLine()){



              String line = in.nextLine();
              //count += new StringTokenizer(line, " ,").countTokens();
              //System.out.println(line);
              words = line.split("/t");
              words = line.split(" ");
              //System.out.println(words + "");
              lineNumber++;
            }
           for(String word : words){
             //System.out.println(word);
             if(!record.containsKey(word)){ record.put(word, 1); }
             if(record.containsKey(word)){ record.put(word, record.get(word) + 1); }
           }
           System.out.println(record);
           in.close();

        } catch (Exception ex) {
            ex.printStackTrace();
        }

  }

  Profile(String file){
    this.file = file;
  }
  Profile(String file, int len){
    this.file = file;
    this.len = len;
  }
  public static void main(String[] args){
      file = args[0] + "";
      File a = new File(file);
      //Scanner in = new Scanner(a);

      wholeFile(a);  
  }
}

However, when I run the command run Profile text.txt, only the last line ends up in the HashMap:

> run Profile text.txt
{one=2, this=2, try=2}
> 

What am I doing incorrectly? How do I efficiently store all the words of a .txt file in a HashMap? Any advice will be helpful.

As other answers have stated, you misplaced the for loop that processes the result of the split. It should be inside the while loop, like so:

while (in.hasNextLine()) {
    String line = in.nextLine();
    words = line.split(" ");

    // Inside the while loop, so the words split from each line are counted
    for (String word : words) {
        if (!record.containsKey(word)) {
            record.put(word, 1);
        }
        else {
            record.put(word, record.get(word) + 1);
        }
    }
}
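
For reference, the corrected loop can be dropped into a small self-contained program. This is only a sketch: the class name WordCount and the method countWords are made up for illustration and are not part of the original Profile class, and it assumes words are separated by single spaces, as in the sample file.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.HashMap;
import java.util.Map;
import java.util.Scanner;

// Hypothetical class name; not part of the original Profile class.
class WordCount {

    // Reads the file line by line and counts every word on every line.
    static Map<String, Integer> countWords(File file) throws FileNotFoundException {
        Map<String, Integer> record = new HashMap<>();
        // try-with-resources closes the Scanner even if an exception is thrown.
        try (Scanner in = new Scanner(file)) {
            while (in.hasNextLine()) {
                String line = in.nextLine();
                // The counting loop sits inside the while loop, so it runs once per line.
                for (String word : line.split(" ")) {
                    if (word.isEmpty()) {
                        continue; // a blank line splits into a single empty string
                    }
                    if (!record.containsKey(word)) {
                        record.put(word, 1);
                    } else {
                        record.put(word, record.get(word) + 1);
                    }
                }
            }
        }
        return record;
    }

    public static void main(String[] args) throws FileNotFoundException {
        System.out.println(countWords(new File(args[0])));
    }
}
```

Run against the text.txt from the question, the map contains every word in the file, with "the" counted 11 times (ten "it was the ..." lines plus "see the test").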

Note that you were also doing two consecutive splits, which doesn't make any sense: the second one overwrites the result of the first.
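
To make the point about the two splits concrete, here is a small sketch of what those calls do. Note also that "/t" is a forward slash followed by t, not the tab escape "\t", so the first call would not split on tabs even if its result were kept. The class name SplitDemo and the sample string are invented for illustration.

```java
import java.util.Arrays;

// Hypothetical demo class; the sample line is made up.
class SplitDemo {
    public static void main(String[] args) {
        String line = "it was\tthe best";

        // "/t" is the two characters '/' and 't', so nothing matches here
        // and the whole line comes back as a single element.
        String[] words = line.split("/t");
        System.out.println(words.length);

        // This assignment discards the previous result entirely.
        words = line.split(" ");
        System.out.println(Arrays.toString(words));
    }
}
```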

You should consider storing your data as a .json file in standard JSON format, and then parsing it.

You need to put the for loop that puts the words into the hash map inside the while loop. As it is, you loop over all the lines and then process only the last one.
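
The effect can be reproduced without a file. The sketch below mimics the structure of the question's code on two hard-coded lines (the class name LastLineOnly is invented, and a TreeMap replaces the HashMap only to make the printed order predictable). It also shows why every count in the question's output is 2 rather than 1: the two if statements are independent, so a word that was just inserted with 1 immediately passes the second containsKey check and is incremented.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical demo class reproducing the structure of the question's code.
class LastLineOnly {
    static Map<String, Integer> buggyCount(String[] lines) {
        Map<String, Integer> record = new TreeMap<>(); // sorted only for stable printing
        String[] words = new String[0];
        for (String line : lines) {
            words = line.split(" "); // overwritten on every iteration
        }
        // Runs once, after the loop, so it only sees the last line's words.
        for (String word : words) {
            if (!record.containsKey(word)) { record.put(word, 1); }
            // Not an else: this is also true for a word that was just added.
            if (record.containsKey(word)) { record.put(word, record.get(word) + 1); }
        }
        return record;
    }

    public static void main(String[] args) {
        String[] lines = { "see the test", "try this one" };
        System.out.println(buggyCount(lines)); // prints {one=2, this=2, try=2}
    }
}
```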

Wow, you're making this complicated.

  1. Investigate the Java String split method.

  2. Think about your hash map. For counting, you only want one entry for each unique word. So in pseudocode, you want something like:

    open file
    for each line in file do
        for each word in line do
            if not map.containsKey(word)
                map.put(word, 1)
            else
                -- increment your count here
            fi
        od
    od
    do something with the results
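
One possible Java rendering of that pseudocode is sketched below. It is not the answerer's code: the class and method names (PseudocodeCount, count) are invented, it takes lines that have already been read rather than opening a file, and it uses Map.merge as a compact equivalent of the containsKey/put branch.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical class and method names, for illustration only.
class PseudocodeCount {
    static Map<String, Integer> count(List<String> lines) {
        Map<String, Integer> map = new HashMap<>();
        for (String line : lines) {                 // for each line in file
            for (String word : line.split(" ")) {   // for each word in line
                if (word.isEmpty()) {
                    continue;                       // skip blanks produced by empty lines
                }
                // Inserts 1 if the word is new, otherwise adds 1 to the stored count.
                map.merge(word, 1, Integer::sum);
            }
        }
        return map;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("it was the best of times", "it was the worst of times");
        Map<String, Integer> counts = count(lines);
        System.out.println(counts.get("it"));    // prints 2
        System.out.println(counts.get("worst")); // prints 1
    }
}
```

The lines could come from java.nio.file.Files.readAllLines, for example, if the whole file fits comfortably in memory.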

Updated to use String.split. Damn whippersnappers.

Put the for (String word : words) loop inside the while (in.hasNextLine()) loop.

Instead of split(" "), it is better to use split("\\s+"), because the input is free-form text.
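
The difference shows up as soon as a line contains tabs or repeated spaces. A short sketch follows (class name and sample line invented for illustration); the whitespace regex is written "\\s+" in Java source. Trimming first is an addition here, since leading whitespace would otherwise produce an empty first element.

```java
import java.util.Arrays;

// Hypothetical demo class; the sample line is made up.
class WhitespaceSplit {
    static String[] words(String line) {
        // "\\s+" in Java source is the regex \s+ : one or more whitespace characters.
        return line.trim().split("\\s+");
    }

    public static void main(String[] args) {
        String line = "it  was\tthe   best";
        // Splitting on a single space leaves empty strings and keeps the tab inside a token.
        System.out.println(Arrays.toString(line.split(" ")));
        // Splitting on runs of whitespace yields the four words.
        System.out.println(Arrays.toString(words(line))); // prints [it, was, the, best]
    }
}
```

An empty line still yields a single empty string with either approach, so a word.isEmpty() check before counting remains useful.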
