Relative Frequency Count of letters (a-z) from a text file in Java
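My text input file has already been preprocessed and contains only letters (a-z) and spaces. For some reason, the relative frequency count fails when I feed it a very large text file (about 400,000 words, a figure I got by cutting and pasting the text into MS Word). It does work for smaller files, for example one where the total number of characters is 36. Could someone please tell me where my code goes wrong?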
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;

public class SoloCount {
    public static void main(String[] args) throws IOException {
        String inputFile = "sampleOutput.txt";
        try {
            // My array for a-z (97-122, based on the ASCII table)
            int[] myArray = new int[26];
            BufferedReader readerObject = new BufferedReader(new FileReader(inputFile));
            String sCurrentLine = "";
            sCurrentLine = readerObject.readLine();
            // For each character in the line read from the input file, a-z will be counted.
            for (int i = 0; i < sCurrentLine.length(); i++)
            {
                // Checks that the character is a letter and not a space.
                if (Character.isLetter(sCurrentLine.charAt(i)) == true)
                {
                    char singleLetter = sCurrentLine.charAt(i);
                    // Increment the frequency of the character. 97-122 represents a-z (ASCII table), e.g. lowercase a = 97.
                    myArray[(int)(singleLetter)-97] = myArray[(int)(singleLetter)-97] + 1;
                }
            }
            readerObject.close();

            // Calculate the total number of characters from the input file.
            double sumOfCharacters = 0;
            for (int i = 0; i < myArray.length; i++)
            {
                sumOfCharacters += myArray[i];
            }
            System.out.println("The total number of characters in this file is: " + sumOfCharacters);

            // Calculate the relative frequency: divide the count for each letter (a-z) by sumOfCharacters.
            System.out.printf("%10s%6s%n", "Letter", "%"); // column labels "Letter" and "%"
            System.out.println();
            for (int i = 0; i < myArray.length; i++)
            {
                char singleLetter = (char)(i + 97); // convert the decimal ASCII code back to a letter a-z
                double value = myArray[i];
                System.out.printf("%8s%13f%n", singleLetter, (value / sumOfCharacters) * 100);
            }
        }
        catch (FileNotFoundException e) {
            e.printStackTrace();
        }
    }
}
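You are reading only one line of the file: readLine() is called once, so everything after the first line break is never counted. The small file probably has just 36 characters on that line, or a newline right after the first 36 characters.
You can also give the BufferedReader a larger buffer by passing an initial buffer size, although this only affects I/O efficiency; readLine() still returns a single line per call: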
BufferedReader readerObject = new BufferedReader(new FileReader(inputFile), 2048);
See here for more details.
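The one-line diagnosis above can be sketched as a loop that keeps calling readLine() until it returns null. This is a minimal sketch rather than the asker's final code: the class name `LetterFrequency` and the helper `countLetters` are made up for illustration, and the file name `sampleOutput.txt` is taken from the question.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;

// Hypothetical class name, used only for this illustration.
public class LetterFrequency {

    // Hypothetical helper: counts 'a'..'z' across EVERY line of the input.
    static int[] countLetters(Reader source) throws IOException {
        int[] counts = new int[26];
        // try-with-resources closes the reader even if an exception is thrown.
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            // readLine() returns null once the end of the input is reached.
            while ((line = reader.readLine()) != null) {
                for (int i = 0; i < line.length(); i++) {
                    char c = line.charAt(i);
                    // Explicit range check: anything outside a-z is skipped,
                    // so the array index always stays within 0..25.
                    if (c >= 'a' && c <= 'z') {
                        counts[c - 'a']++;
                    }
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // File name from the question; an optional argument overrides it.
        String inputFile = args.length > 0 ? args[0] : "sampleOutput.txt";
        int[] counts = countLetters(new FileReader(inputFile));

        long total = 0;
        for (int n : counts) {
            total += n;
        }
        System.out.println("The total number of letters in this file is: " + total);
        if (total == 0) {
            return; // nothing to report, and avoids dividing by zero
        }

        System.out.printf("%10s%6s%n%n", "Letter", "%");
        for (int i = 0; i < counts.length; i++) {
            System.out.printf("%8s%13f%n", (char) ('a' + i), 100.0 * counts[i] / total);
        }
    }
}
```

The range check replaces Character.isLetter, which also accepts uppercase and non-ASCII letters; with the original `- 97` indexing those would fall outside the 26-element array. If the input really contains only a-z and spaces, as the question states, both checks behave the same.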