Print statistics for every character in a file in Python

I am trying to read a file's data and print the percentage of each character in the file, without printing duplicates: each character should be printed only once, together with its percentage. Below is the snippet.

for all_char in text:
    char_counter = 0 
    if count_char(text, all_char) > 1:
        perc1 = 100 / len(text) * count_char(text, all_char)
        print("{0} - {1}%".format(all_char, round(perc1, 2)))
        with open(filename, "w") as w:        #<-------- I need a code to remove a single character
            w.truncate(char_counter)
            char_counter += 1

    elif count_char(text, all_char) == 1:
        perc2 = 100 * count_char(text, all_char) / len(text)
        print("{0} - {1}%".format(all_char, round(perc2, 2)))
        char_counter += 1

Above, I created a variable called char_counter, which is incremented on every iteration. The function count_char returns how many times a character occurs in the file; if that number is greater than 1, the character should be deleted from the file so that it is printed only once. That is the basic idea, but the code gives me an error.

You can get the character counts for the entire file by using a Counter over its characters. The percentage for each character is then (count for that character) / (total count) * 100. Because a Counter holds each distinct character exactly once, there is no need to delete anything from the file to avoid duplicates.

from collections import Counter
from itertools import chain

with open(filename) as f:
    counts = Counter(chain.from_iterable(f))

total = sum(counts.values())

for character, count in counts.items():
    print('{:<2} - {:>6.2f}%'.format(repr(character)[1:-1], (count/total) * 100))

For the text

Mary had a little lamb.

This prints

M  -   4.17%
a  -  16.67%
r  -   4.17%
y  -   4.17%
   -  16.67%
h  -   4.17%
d  -   4.17%
l  -  12.50%
i  -   4.17%
t  -   8.33%
e  -   4.17%
m  -   4.17%
b  -   4.17%
.  -   4.17%
\n -   4.17%
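
If the most frequent characters should come first, the same idea can be wrapped in a small helper. This is only a sketch: the names char_percentages and sample are made up for illustration, and it works on a string already in memory rather than on an open file.

```python
from collections import Counter


def char_percentages(text):
    """Return (character, percentage) pairs, most frequent first.

    Hypothetical helper for illustration; not part of the original answer.
    """
    counts = Counter(text)          # each distinct character appears once
    total = sum(counts.values())    # equals len(text)
    if total == 0:
        return []                   # avoid dividing by zero on empty input
    return [(ch, 100 * n / total) for ch, n in counts.most_common()]


sample = "Mary had a little lamb.\n"
for ch, pct in char_percentages(sample):
    # repr(...)[1:-1] shows whitespace such as a newline as an escape sequence
    print('{:<2} - {:>6.2f}%'.format(repr(ch)[1:-1], pct))
```

To use it with a file, read the contents first, for example by passing f.read() inside a with open(filename) as f: block.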
