I want to count word frequency across multiple files.
For example, suppose the files contain these words:
a1.txt = {aaa, aaa, aaa}
a2.txt = {aaa}
a3.txt = {aaa, bbb}
So the result must be aaa = 3, bbb = 1 (the number of files in which each word occurs).
I have defined the following data structures:
LinkedHashMap<String, Integer> wordCount = new LinkedHashMap<String, Integer>();
Map<String, LinkedHashMap<String, Integer>> fileToWordCount = new HashMap<String, LinkedHashMap<String, Integer>>();
Then I read the words from the files and put them into wordCount and fileToWordCount:
/* lineWords[i] is a word from a line in the file */
if (wordCount.containsKey(lineWords[i])) {
    System.out.println("1111111::" + lineWords[i]);
    wordCount.put(lineWords[i], wordCount.get(lineWords[i]).intValue() + 1);
} else {
    System.out.println("222222::" + lineWords[i]);
    wordCount.put(lineWords[i], 1);
}
fileToWordCount.put(filename, wordCount); // here we map the filename to the occurrences of words
Finally, I print fileToWordCount with the following code:
Collection a;
Set filenameset;
filenameset = fileToWordCount.keySet();
a = fileToWordCount.values();
for (Object filenameFromMap : filenameset) {
    System.out.println("FILENAMEFROMAP::" + filenameFromMap);
    System.out.println("VALUES::" + a);
}
This prints:
FILENAMEFROMAP::a3.txt
VALUES::[{aaa=5, bbb=1}, {aaa=5, bbb=1}, {aaa=5, bbb=1}]
FILENAMEFROMAP::a1.txt
VALUES::[{aaa=5, bbb=1}, {aaa=5, bbb=1}, {aaa=5, bbb=1}]
FILENAMEFROMAP::a2.txt
VALUES::[{aaa=5, bbb=1}, {aaa=5, bbb=1}, {aaa=5, bbb=1}]
So, how can I use the map fileToWordCount to find the word frequency across the files?
You're making it harder than necessary. Here's how I would do it:
// Counter, files and readWordsFromFile(...) are assumed to be defined elsewhere
Map<String, Counter> wordCounts = new HashMap<String, Counter>();
for (File file : files) {
    Set<String> wordsInFile = new HashSet<String>(); // to avoid counting the same word in the same file twice
    for (String word : readWordsFromFile(file)) {
        if (!wordsInFile.contains(word)) {
            wordsInFile.add(word);
            Counter counter = wordCounts.get(word);
            if (counter == null) {
                counter = new Counter();
                wordCounts.put(word, counter);
            }
            counter.increment();
        }
    }
}
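For reference, here is a self-contained sketch of the same idea that can be run as-is. It is only one way to fill in the missing pieces: the Counter class below is a hypothetical minimal stand-in, the class and method names are made up for illustration, and the words are passed in as in-memory lists instead of being read from disk, so readWordsFromFile is not needed.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class DocumentFrequency {
    // Hypothetical minimal Counter; the answer above does not show its definition.
    static final class Counter {
        private int value;
        void increment() { value++; }
        int get() { return value; }
        @Override public String toString() { return Integer.toString(value); }
    }

    // Input: file name -> words of that file (assumed to be read already).
    // Output: word -> number of files that contain the word at least once.
    static Map<String, Counter> countFilesPerWord(Map<String, List<String>> wordsByFile) {
        Map<String, Counter> wordCounts = new LinkedHashMap<>();
        for (List<String> words : wordsByFile.values()) {
            Set<String> wordsInFile = new HashSet<>(); // fresh set for every file
            for (String word : words) {
                // Set.add returns false when the word was already seen in this file
                if (wordsInFile.add(word)) {
                    wordCounts.computeIfAbsent(word, w -> new Counter()).increment();
                }
            }
        }
        return wordCounts;
    }

    public static void main(String[] args) {
        Map<String, List<String>> wordsByFile = new LinkedHashMap<>();
        wordsByFile.put("a1.txt", Arrays.asList("aaa", "aaa", "aaa"));
        wordsByFile.put("a2.txt", Arrays.asList("aaa"));
        wordsByFile.put("a3.txt", Arrays.asList("aaa", "bbb"));
        System.out.println(countFilesPerWord(wordsByFile)); // prints {aaa=3, bbb=1}
    }
}
```

Note that the per-file set is created inside the outer loop. Creating one collection per file, rather than sharing a single one across all files, is what keeps the counts of different files from bleeding into each other.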
If I may suggest another approach :)
Use a Map<String, Set<String>> that maps each word to the set of files containing it.
foreach file f in files
    foreach word w in f
        if w in map.keys()
            map[w].add(f)
        else
            map[w] = new set containing only f
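The pseudocode above could look roughly like this in Java. This is a sketch under a few assumptions: the class and method names are invented for illustration, and the words of each file are assumed to be available as in-memory lists. The frequency of a word is then simply the size of its set of files.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class WordToFiles {
    // Input: file name -> words of that file (assumed to be read already).
    // Output: word -> sorted set of the file names that contain the word.
    static Map<String, Set<String>> index(Map<String, List<String>> wordsByFile) {
        Map<String, Set<String>> filesByWord = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : wordsByFile.entrySet()) {
            for (String word : entry.getValue()) {
                // Create the set on first sight of the word, then add the file name;
                // adding the same file twice has no effect because it is a set.
                filesByWord.computeIfAbsent(word, w -> new TreeSet<>()).add(entry.getKey());
            }
        }
        return filesByWord;
    }

    public static void main(String[] args) {
        Map<String, List<String>> wordsByFile = new LinkedHashMap<>();
        wordsByFile.put("a1.txt", Arrays.asList("aaa", "aaa", "aaa"));
        wordsByFile.put("a2.txt", Arrays.asList("aaa"));
        wordsByFile.put("a3.txt", Arrays.asList("aaa", "bbb"));
        for (Map.Entry<String, Set<String>> e : index(wordsByFile).entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue().size() + " " + e.getValue());
        }
        // prints:
        // aaa = 3 [a1.txt, a2.txt, a3.txt]
        // bbb = 1 [a3.txt]
    }
}
```

Compared with a plain counter, this variant also records which files contain each word, at the cost of storing the file names.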