Nested Loops and Arrays (Frequency Analysis)

I was wondering if I could get some help with properly incrementing values in arrays. The point of this program is to analyze the frequency of individual letters in a text file and record them in an array.

import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;
public class FrequencyAnaylsis 
{
public static String[] alphabet = {"a","b","c","d","e","f","g","h","i","j","k","l","m","n","o","p","q","r","s","t","u","v","w","x","y","z"};
public static int[] alphabetFrequency = new int[26];
public static int[] alphabetPercentage = new int[26];

FrequencyAnaylsis()
{
}

public static void getFileAndFrequency() throws FileNotFoundException
{
    File plaintext = new File("subplaintext");
    Scanner inFile = new Scanner(plaintext);

    for (int i = 0; i < 26; i++) //specifies the index of alphabet (the letter the program is looking for)
    {   
        while (inFile.hasNext()) //is true when the document has another word following the previous
        {
            String[] lettersToCompare = inFile.next().toLowerCase().split("(?!^)"); //splits the specified word into a String array

            for (int stringIndex = 0; stringIndex < lettersToCompare.length; stringIndex++) //loops through the index (individual letters) of the split word
            {
                if (lettersToCompare[stringIndex].equals(alphabet[i])) //if letter specified in split word equals letter specified in alphabet
                {
                    alphabetFrequency[i]++; //add one to the frequency array in the same index as the index in alphabet
                }
            }
        }   
    }
}

public static void getPercentage()
{
    int alphabetFrequencyTotal = 0;

    for (int i = 0; i < 26; i++)
    {
        alphabetFrequencyTotal += alphabetFrequency[i]; //accumulate the running total of all letters
    }

    for (int i = 0; i < 26; i++)
    {
        alphabetPercentage[i] = 100 * alphabetFrequency[i] / alphabetFrequencyTotal; //multiply before dividing so integer division does not truncate every share to 0
    }
}

public static void printData()
{
    for (int i = 0; i < 26; i++)
    {
        System.out.println(alphabetFrequency[i]);
    }
}

public static void main(String[] args) throws FileNotFoundException
{
    FrequencyAnaylsis.getFileAndFrequency();
    //FrequencyAnaylsis.getPercentage();
    FrequencyAnaylsis.printData();

}
}

When the program reads this sentence: "Five score years ago, a great American, in whose symbolic shadow we stand, signed the Emancipation Proclamation.", it outputs the following:

12
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0

It counts the characters correctly for "a", but not for any other letter. Why is that? Any help would be appreciated.

Your problem is that you're trying to go through the file once for each letter of the English alphabet with this:

public static void getFileAndFrequency() throws FileNotFoundException
{
    File plaintext = new File("subplaintext");
    Scanner inFile = new Scanner(plaintext);

    for (int i = 0; i < 26; i++) //specifies the index of alphabet (the letter the program is looking for)
    {   
        while (inFile.hasNext()) //is true when the document has another word following the previous
        {
            String[] lettersToCompare = inFile.next().toLowerCase().split("(?!^)"); //splits the specified word into a String array

            for (int stringIndex = 0; stringIndex < lettersToCompare.length; stringIndex++) //loops through the index (individual letters) of the split word
            {
                if (lettersToCompare[stringIndex].equals(alphabet[i])) //if letter specified in split word equals letter specified in alphabet
                {
                    alphabetFrequency[i]++; //add one to the frequency array in the same index as the index in alphabet
                }
            }
        }   
    }
}

After the outer loop ( for (int i = 0; i < 26; i++) ) has run once, the Scanner is at the end of the file, so inFile.hasNext() returns false on every subsequent iteration and those runs behave as though the file were empty. That is why only index 0 ("a") is ever counted.
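The effect is easy to reproduce in isolation. The following is a minimal sketch, not part of the original program: the class name ScannerExhaustionDemo and the helper tokensPerPass are hypothetical, and the Scanner reads from a String rather than a file so the example is self-contained.

```java
import java.util.Scanner;

class ScannerExhaustionDemo {
    // Runs `passes` consecutive read loops over ONE Scanner and records
    // how many tokens each pass actually sees.
    static int[] tokensPerPass(String text, int passes) {
        Scanner in = new Scanner(text);
        int[] counts = new int[passes];
        for (int pass = 0; pass < passes; pass++) {
            while (in.hasNext()) { // false for good once the input is consumed
                in.next();
                counts[pass]++;
            }
        }
        in.close();
        return counts;
    }

    public static void main(String[] args) {
        // Prints 3, then 0, then 0: only the first pass sees any words.
        for (int count : tokensPerPass("we stand signed", 3)) {
            System.out.println(count);
        }
    }
}
```

The same thing happens with a Scanner over a File: it does not rewind, so reading the input more than once would need a fresh Scanner for each pass.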

A simple fix is to change the order of the loops, so that the file is read once and each word is checked against all 26 letters:

public static void getFileAndFrequency() throws FileNotFoundException
{
    File plaintext = new File("subplaintext");
    Scanner inFile = new Scanner(plaintext);

    while (inFile.hasNext()) //is true when the document has another word following the previous
    {   
        String[] lettersToCompare = inFile.next().toLowerCase().split("(?!^)"); //splits the specified word into a String array

        for (int i = 0; i < 26; i++) //specifies the index of alphabet (the letter the program is looking for)
        {
            for (int stringIndex = 0; stringIndex < lettersToCompare.length; stringIndex++) //loops through the index (individual letters) of the split word
            {
                if (lettersToCompare[stringIndex].equals(alphabet[i])) //if letter specified in split word equals letter specified in alphabet
                {
                    alphabetFrequency[i]++; //add one to the frequency array in the same index as the index in alphabet
                }
            }
        }   
    }
}

However, you're doing far too much work in the inner loop. As @Jägermeister points out, you'd be better off either using a Map (e.g. HashMap) or taking advantage of the fact that each letter maps directly to an index in your alphabetFrequency array:

public static void getFileAndFrequency() throws FileNotFoundException
{
    File plaintext = new File("subplaintext");
    Scanner inFile = new Scanner(plaintext);

    while (inFile.hasNext()) //is true when the document has another word following the previous
    {   
        String[] lettersToCompare = inFile.next().toLowerCase().split("(?!^)"); //splits the specified word into a String array

        for (int stringIndex = 0; stringIndex < lettersToCompare.length; stringIndex++) //loops through the index (individual letters) of the split word
        {
            char ch = lettersToCompare[stringIndex].charAt(0);
            if (ch >= 'a' && ch <= 'z')
                alphabetFrequency[ch-'a']++; //add one to the frequency array in the same index as the index in alphabet
        }   
    }
}
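The index trick works because the characters 'a' through 'z' are consecutive, so ch - 'a' is 0 for 'a' and 25 for 'z'. As a sketch of the same idea without the regex split, the version below walks the characters of a String directly. The class name LetterCounter and the method countLetters are hypothetical, and it takes the text as a parameter instead of reading "subplaintext" so it can be run on its own.

```java
import java.util.Locale;

class LetterCounter {
    // Returns 26 counts: index 0 holds 'a', index 25 holds 'z'.
    static int[] countLetters(String text) {
        int[] counts = new int[26];
        String lower = text.toLowerCase(Locale.ROOT);
        for (int i = 0; i < lower.length(); i++) {
            char ch = lower.charAt(i);
            if (ch >= 'a' && ch <= 'z') { // skip spaces, digits, punctuation
                counts[ch - 'a']++;       // 'a' - 'a' == 0, 'b' - 'a' == 1, ...
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] counts = countLetters("Five score years ago, a great American, "
                + "in whose symbolic shadow we stand, signed the Emancipation Proclamation.");
        for (int i = 0; i < counts.length; i++) {
            System.out.println((char) ('a' + i) + ": " + counts[i]);
        }
    }
}
```

For the sentence in the question this reports 12 for "a", matching the one value the original program got right, and it now fills in the other 25 slots as well.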

Example of using a Map:

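// Requires: import java.util.HashMap; import java.util.Map;
public static Map<Character,Integer> getFileAndFrequency() throws FileNotFoundException
{
    Map<Character,Integer> frequencyMap = new HashMap<Character,Integer>(); //the wrapper type for char is Character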
    File plaintext = new File("subplaintext");
    Scanner inFile = new Scanner(plaintext);

    while (inFile.hasNext()) //is true when the document has another word following the previous
    {   
        String[] lettersToCompare = inFile.next().toLowerCase().split("(?!^)"); //splits the specified word into a String array

        for (int stringIndex = 0; stringIndex < lettersToCompare.length; stringIndex++) //loops through the index (individual letters) of the split word
        {
            char ch = lettersToCompare[stringIndex].charAt(0);
            Integer frequency = frequencyMap.get(ch);
            if (frequency ==null) {
               frequency = 0;
            }
            frequency += 1;
            frequencyMap.put(ch, frequency);
        }   
    }
    return frequencyMap;
}
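On Java 8 or later, the get / null check / put sequence can be collapsed into a single Map.merge call. The sketch below is one possible variation rather than the answer's own code: the class name LetterFrequencyMap is hypothetical, it reads from a String instead of the file, it keeps only the letters a to z (the Map version above counts every character, punctuation included), and it uses a TreeMap so the keys come out in alphabetical order.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

class LetterFrequencyMap {
    // Builds a letter -> count map, sorted by letter because TreeMap orders its keys.
    static Map<Character, Integer> countLetters(String text) {
        Map<Character, Integer> frequency = new TreeMap<>();
        for (char ch : text.toLowerCase(Locale.ROOT).toCharArray()) {
            if (ch >= 'a' && ch <= 'z') {
                // Stores 1 if the key is absent, otherwise adds 1 to the stored count.
                frequency.merge(ch, 1, Integer::sum);
            }
        }
        return frequency;
    }

    public static void main(String[] args) {
        System.out.println(countLetters("Abca")); // prints {a=2, b=1, c=1}
    }
}
```

Letters that never occur are simply missing from the map, so callers that need all 26 entries should use getOrDefault(letter, 0) when reading it.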
