
Counting the number of letters in each word of a paragraph in Python

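If I need to write a function that reads a large paragraph and prints how many words of each length it contains, how would I do that?

Here is what I have tried so far.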

#split the piece of writing into a list so I can search over every word
#need to find out how to make this not take so long

piece = "hello world"
words = piece.split()

#search over every word, have a variable for each number, check to see length of word and add correspondingly

for word in range(words):
    one = 0
    two = 0
    three = 0
    four = 0
    five = 0
    six = 0
    seven = 0
    eight = 0
    nine = 0
    ten = 0
    eleven = 0
    twelve = 0
    thirteen = 0
    other = 0
    total = 0
    if (len(word) == 1):
        one += 1
        total += 1
    elif (len(word) == 2):
        two += 1
        total += 1
    elif (len(word) == 3):
        three += 1
        total += 1
    elif (len(word) == 4):
        four += 1
        total += 1
    elif (len(word) == 5):
        five += 1
        total += 1
    elif (len(word) == 6):
        six += 1
        total += 1
    elif (len(word) == 7):
        seven += 1
        total += 1
    elif (len(word) == 8):
        eight += 1
        total += 1
    elif (len(word) == 9):
        nine += 1
        total += 1
    elif (len(word) == 10):
        ten += 1
        total += 1
    elif (len(word) == 11):
        eleven += 1
        total += 1
    elif (len(word) == 12):
        twelve += 1
        total += 1
    elif (len(word) == 13):
        thirteen += 1
        total += 1
    else:
        other += 1
        total += 1

#print results
print(f'Proportion of 1- letter words: {one / total * 100}% {one} words')
print(f'Proportion of 2- letter words: {two / total* 100}% {two} words')
print(f'Proportion of 3- letter words: {three / total* 100}% {three} words')
print(f'Proportion of 4- letter words: {four / total * 100}% {four} words')
print(f'Proportion of 5- letter words: {five / total * 100}% {five} words')
print(f'Proportion of 6- letter words: {six / total * 100}% {six} words')
print(f'Proportion of 7- letter words: {seven/ total * 100}% {seven} words')
print(f'Proportion of 8- letter words: {eight / total * 100}% {eight} words')
print(f'Proportion of 9- letter words: {nine / total * 100}% {nine} words')
print(f'Proportion of 10- letter words: {ten / total * 100}% {ten} words')
print(f'Proportion of 11- letter words: {eleven / total * 100}% {eleven} words')
print(f'Proportion of 12- letter words: {twelve / total * 100}% {twelve} words')
print(f'Proportion of 13- letter words: {thirteen / total * 100}% {thirteen} words')

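I think the two problems are that I don't know how to make the loop run over the whole paragraph, and I don't know how to write the code so that a large amount of text doesn't take forever to process.

  • Try to avoid repeating code. For example, a dictionary (say, stats) is easier to work with than many separate variables: for each word, increment its entry (stats[len(word)] += …
  • Python comes with batteries included, which can greatly reduce the amount of code you write. In this case, defaultdict or Counter may help.

Applying these, you get something like: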

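from collections import Counter

# Sample input; replace with the paragraph you want to analyse
paragraph = "hello world"

# Map each word length to the number of words of that length
stats = Counter(len(word) for word in paragraph.split())
total_words = sum(stats.values())

for length in sorted(stats.keys()):
    print("proportions of %d words: %f" % (length, stats[length] / total_words))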
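
The question asks for a function, and the first bullet suggests a dictionary in place of separate variables. A minimal sketch along those lines follows. The function name word_length_stats is hypothetical, and stripping surrounding punctuation before measuring a word is an assumption that the question does not state. Note that it iterates over the words directly rather than over range(words), and creates the counters once, before the loop.

```python
from collections import defaultdict
import string


def word_length_stats(paragraph):
    """Return a dict mapping word length -> number of words of that length."""
    stats = defaultdict(int)  # lengths not seen yet start at 0
    # Iterate over the words themselves, not over range(words)
    for word in paragraph.split():
        # Assumption: leading/trailing punctuation should not count as letters
        cleaned = word.strip(string.punctuation)
        if cleaned:
            stats[len(cleaned)] += 1
    return dict(stats)


stats = word_length_stats("Hello, world! It is a fine day.")
total = sum(stats.values())
for length in sorted(stats):
    print(f"{length}-letter words: {stats[length]} ({stats[length] / total:.1%})")
```

This makes a single pass over the text, so the running time grows roughly linearly with the length of the paragraph.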

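UPD: Side note: iterating over a dictionary in Python yields only its keys. Counter is a subclass of dict, so it behaves the same way. For brevity you could therefore write just for length in sorted(stats): , but that may look unintuitive to anyone unfamiliar with this Python feature. stats.keys() gives the same result and is more explicit.

Hello, this code counts all letters, commas, and so on.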
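
As a quick check of the side note above, the following sketch (with an arbitrary sample sentence) shows that iterating a Counter yields the same keys as calling keys() on it:

```python
from collections import Counter

# Arbitrary sample text, used only for illustration
stats = Counter(len(word) for word in "the quick brown fox".split())

# Counter is a dict subclass, so iterating it yields its keys
print(isinstance(stats, dict))                 # True
print(sorted(stats) == sorted(stats.keys()))   # True
print(sorted(stats))                           # [3, 5]
```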

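fname = input('Enter the file name: ')
try:
    f = open(fname)
except OSError:
    print('The file can not be opened')
    raise SystemExit(1)  # stop here; f is undefined if open() failed

# Collect every whitespace-separated token in the file
t = []
with f:
    for line in f:
        t.extend(line.split())

# Count each character (letters, commas, and so on) across all tokens
h = dict()
for i in t:
    for j in i:
        if j in h:
            h[j] += 1
        else:
            h[j] = 1
print(h)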
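
The same character tally can be written more compactly with Counter, which the earlier answer mentions. This is only a sketch: the function name count_characters and the letters_only flag are hypothetical, and it works on an in-memory string rather than a file so that it stays self-contained.

```python
from collections import Counter


def count_characters(text, letters_only=False):
    """Count non-whitespace characters in text; optionally keep only letters."""
    chars = (ch for word in text.split() for ch in word)
    if letters_only:
        # Drop commas, digits, and other non-letter characters
        chars = (ch for ch in chars if ch.isalpha())
    return Counter(chars)


sample = "Hi, hi!"
print(count_characters(sample))
print(count_characters(sample, letters_only=True))
```

With letters_only=True, punctuation such as the comma and the exclamation mark is left out of the tally.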
 


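Notice: Technical posts on this site are licensed under CC BY-SA 4.0. If you republish them, please credit this site's URL or the address of the original post. For any questions, contact: yoyou2525@163.com.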

 