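Using StringTokenizer to count word occurrences
I can't get my program to run as intended. I have been working on it for a week and have finished everything except this part, which I can't get working. The program should count how many times each word occurs.
Input: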
This is my file, yes my file My file.. ? ! , : ; / \ |" ^ * + = _( ) { } [ ] < >
The output should look like this:
file *3
is *1
my *3
this *1
yes *1
Here is my code:
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.InputStreamReader;
import java.io.PrintStream;
import java.util.ArrayList;
import java.util.Collections;
public class cleanup3 {
    public cleanup3() {}
    public static void main(String[] args) {
        try{
            ArrayList myArraylist = new ArrayList();
            System.out.println("Please Enter file");
            InputStreamReader istream = new InputStreamReader(System.in) ;
            BufferedReader bufRead = new BufferedReader(istream) ;
            String fileName = bufRead.readLine();
            BufferedReader file = new BufferedReader(new FileReader(fileName));
            String s = null;
            while((s = file.readLine()) != null) {
                String updated2 = s.replaceAll("[\\.\\,\\?\\!\\:\\;\\/\\|\\\\\\^\\*\\+\\=\\_\\(\\)\\{\\}\\[\\]\\<\\>\"]+"," ");
                //note to self: missing Single quotes (only if the LAST character of a token)
                StringTokenizer st = new StringTokenizer(updated2.toLowerCase());
                while (st.hasMoreTokens()) {
                    String nextToken = st.nextToken();
                    String myKeyValue = (String)myMap.get(nextToken);
                    if(myKeyValue == null){
                        myMap.put(nextToken, "1");
                    }
                    else{
                        int mycount = Integer.parseInt(myKeyValue) + 1;
                        myMap.put(nextToken, String.valueOf(mycount));
                    }
                    System.out.println(nextToken);
                }
            }
            System.out.println( updated2.toLowerCase());
            myArraylist.add(updated2.toLowerCase());
        }
        Collections.sort(myArraylist);
        String outPutFileName = fileName + "sorted.txt";
        PrintStream ps = new PrintStream( outPutFileName );
        ps.print(myArraylist.toString());
        ps.flush();
        ps.close();
    }
    catch (Exception e){
        System.out.println(e.toString());
    }
}
Your code is far more complicated than it needs to be; a few lines are enough.
Here is one elegant way to do it:
// Assumes input holds the full text to analyse
Map<String, Integer> map = new TreeMap<String, Integer>();
// Lower-case, drop everything except letters and spaces, then split on runs of spaces
for (String word : input.toLowerCase().replaceAll("[^a-z ]", "").trim().split(" +"))
    map.put(word, map.containsKey(word) ? map.get(word) + 1 : 1);
for (Map.Entry<String, Integer> entry : map.entrySet())
    System.out.println(entry.getKey() + " *" + entry.getValue());
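The first loop builds the counts; the second iterates over the map entries to print the totals.
Using a TreeMap gives you alphabetical ordering of the words for free.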
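If you would rather keep StringTokenizer from the question, the same TreeMap idea still applies. What follows is only a minimal sketch: the class name TokenizerWordCount, the helper countWords, and the DELIMITERS string are made up for illustration, and the delimiter set simply mirrors the punctuation in the sample input rather than being exhaustive. Passing the punctuation to StringTokenizer as delimiters avoids a separate replaceAll step.

```java
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

public class TokenizerWordCount {
    // Every character in this string is treated as a separator by StringTokenizer.
    // Illustrative set taken from the sample input, not an exhaustive list.
    private static final String DELIMITERS = " \t\r\n.,?!:;/\\|\"^*+=_(){}[]<>";

    // Hypothetical helper: counts lower-cased tokens; TreeMap keeps keys sorted.
    public static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        StringTokenizer st = new StringTokenizer(text.toLowerCase(), DELIMITERS);
        while (st.hasMoreTokens()) {
            // merge() stores 1 for a new word, otherwise adds 1 to the old count
            counts.merge(st.nextToken(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        String input = "This is my file, yes my file My file.. ? ! , : ; / \\ |\" ^ * + = _( ) { } [ ] < >";
        for (Map.Entry<String, Integer> entry : countWords(input).entrySet()) {
            System.out.println(entry.getKey() + " *" + entry.getValue());
        }
    }
}
```

Run against the sample input, this prints file *3, is *1, my *3, this *1 and yes *1, one per line.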
Try a BufferedReader together with a regex, like this:
// Assumes myFile holds the path of the input file
Map<String, Integer> map = new HashMap<String, Integer>();
String line;
try (BufferedReader r = new BufferedReader(new FileReader(myFile))) {
    Pattern pattern = Pattern.compile("[a-zA-Z]+");  // a word is a run of letters
    while ((line = r.readLine()) != null) {
        Matcher matcher = pattern.matcher(line);
        while (matcher.find()) {
            // Lower-case so that "My" and "my" are counted together
            String word = matcher.group().toLowerCase();
            map.put(word, map.get(word) == null ? 1 : map.get(word) + 1);
        }
    }
}
System.out.println(map.toString());
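To get the exact output format requested in the question out of this regex approach, one option is to collect the counts in a TreeMap and print one entry per line. This is a rough sketch rather than a definitive version: the class name RegexWordCount and the helper countWords are hypothetical, and a StringReader stands in for new FileReader(...) so that it runs without a file on disk.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexWordCount {
    // A word is taken to be a run of ASCII letters (an assumption of this sketch).
    private static final Pattern WORD = Pattern.compile("[a-zA-Z]+");

    // Hypothetical helper: reads line by line and counts each lower-cased word.
    public static Map<String, Integer> countWords(Reader source) {
        Map<String, Integer> counts = new TreeMap<>(); // keys stay in alphabetical order
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                Matcher matcher = WORD.matcher(line);
                while (matcher.find()) {
                    // Lower-case so that "My" and "my" share one entry
                    counts.merge(matcher.group().toLowerCase(), 1, Integer::sum);
                }
            }
        } catch (IOException e) {
            // Rethrown unchecked to keep the sketch short
            throw new UncheckedIOException(e);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Replace the StringReader with new FileReader(fileName) to read a real file.
        Reader source = new StringReader("This is my file, yes my file\nMy file.. ? !");
        countWords(source).forEach((word, count) -> System.out.println(word + " *" + count));
    }
}
```

Because the map is a TreeMap, the words come out alphabetically without a separate sort.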