Calculating letter frequencies from a text file

Question

I have a text file that lists the letters a, b, ..., z together with their number of occurrences. It looks like this:


"letter";"occurences"
a;105
b;29
...
z;0

I have to use this data to create a vector "freq" of length 26 containing the frequency of occurrence of each of the 26 letters from a to z. This is my attempt:


def letterFrequency(small_text):
    filein = open("small_text.txt", "r") # Opens the file for reading
    lines = filein.readlines() # Reads all lines into an array
    smalltxt = "".join(lines) # Joins the lines into one big string
    freq = 0
    n = 1296
    for letter in lines:
        np.count_nonzero(letter)
        freq.append(letter)
        freq = letter/n
     return freq
print(letterFrequency('small_text.txt'))

The total number of letters is n = 1296, and the frequencies are to be given in %, so the expected output is:

[ 8.10185185  2.23765432  2.4691358   4.55246914 12.34567901
  2.00617284  1.92901235  6.71296296  7.17592593  0.07716049
  1.15740741  3.39506173  1.08024691  6.71296296  7.87037037
  1.46604938  0.07716049  6.01851852  5.40123457 10.95679012
  2.85493827  0.92592593  2.93209877  0.          1.54320988
  0.        ]

For example, the letter a gives 105/1296 ≈ 0.081, i.e. about 8.1%.

My code isn't working, so any help pointing me in the right direction would be appreciated. Thank you!

Answer 1

You need to create a list to store the values and append each value to that list.

Also, instead of hardcoding 1296, you should accumulate the total count while reading the file and then divide by it.

def letterFrequency(filename):
    frequencies = []
    letters = []
    accum = 0
    with open(filename, 'r') as fin:
        for line in fin:
            try:
                letter, freq = line.strip().split(';')
                freq = int(freq)
            except ValueError:       # skips the header line and blank or malformed lines
                continue
            accum += freq
            letters.append(letter)
            frequencies.append(freq)

    # normalize frequencies to percentages, to match the expected output
    frequencies = [100 * i / accum for i in frequencies]

    # you need to keep a list of letters, otherwise how do you
    # know which letter each frequency belongs to?
    return letters, frequencies
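
If you want exactly the length-26 vector described in the question, ordered from a to z, one possible variant is sketched below. It is only a sketch under a few assumptions: the file uses `;` as the separator as shown in the question, the function name `letter_frequency_percent` is made up for illustration, and any letter missing from the file is treated as having a count of 0.

```python
import csv
import string


def letter_frequency_percent(filename):
    """Return a list of 26 percentages, ordered from 'a' to 'z'."""
    # Start every letter at 0 so the result always has length 26.
    counts = dict.fromkeys(string.ascii_lowercase, 0)

    with open(filename, newline="") as fin:
        for row in csv.reader(fin, delimiter=";"):
            # Skip the header row, blank lines and anything malformed.
            if len(row) != 2:
                continue
            letter, count = row[0], row[1].strip()
            if letter in counts and count.isdigit():
                counts[letter] = int(count)

    total = sum(counts.values())
    if total == 0:
        raise ValueError("no letter counts found in " + filename)

    # Percentage of the total for each letter, in alphabetical order.
    return [100 * counts[c] / total for c in string.ascii_lowercase]
```

With the full file from the question (total 1296), the first entry would be 100 * 105 / 1296, which is about 8.10 and agrees with the expected output. Wrapping the returned list in `np.array(...)` gives a NumPy vector if that is what you need.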
