Python: a dictionary for counting the unique letters in a string

Question

I'm using the string library in Python 3 to solve this HarvardX challenge, but I don't think my solution is very good. Can you see a tidier one?

Here is my code:

#writing the 2 strings
import string

alpha = string.ascii_letters

alpha
'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'

sent = 'She sells seashells on the seashore the seashells she sells are seashells for sure'
sent
'She sells seashells on the seashore the seashells she sells are seashells for sure'

#WRITING DICT to lookup count alpha string characters within 'She sells(etc)'

mydict_countalpha = {alpha[0]:sent.count(alpha[0]), alpha[1]:sent.count(alpha[1]), alpha[2]:sent.count(alpha[2]), alpha[3]:sent.count(alpha[3]), alpha[4]:sent.count(alpha[4]), alpha[5]:sent.count(alpha[5])}

#result:
mydict_countalpha
{'a': 5, 'b': 0, 'c': 0, 'd': 0, 'e': 16, 'f': 1}

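Great. That is correct.

But here's the problem

The alpha string is 52 characters long. If I write this dictionary out by hand, entry by entry, I think I will make a mistake. How can I do this better? Does it have something to do with iteration?

Why I'm asking

This is homework from the excellent HarvardX course "Using Python for Research". Under HarvardX's guidelines we are graded on it, but consulting Stack Overflow to solve problems is allowed. :-) So, in case you were wondering, I'm not cheating by asking.

I think this challenge has very broad applications, and I hope you find it interesting too. I'm a beginner programmer on a steep learning curve with Python, so thank you for any advice!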

Best,

A

Answer 1

The Pythonic approach is to use collections.Counter and filter its keys against ascii_letters with a dictionary comprehension. For efficiency, you can convert ascii_letters to a set:

from collections import Counter
from string import ascii_letters

letters_set = set(ascii_letters)

res = {k: v for k, v in Counter(sent).items() if k in letters_set}

print(res)

{'S': 1, 'h': 8, 'e': 16, 's': 17, 'l': 10, 'a': 5,
 'o': 3, 'n': 1, 't': 2, 'r': 4, 'f': 1, 'u': 1}

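This solution has O(m + n) complexity, while your current solution has O(m * n) complexity, where n is the length of the string and m is the number of letters being looked up. You can see this by noting that str.count (like list.count) has O(n) complexity, i.e. each iteration of the dictionary comprehension requires a full pass over the string.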
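To make the comparison concrete, here is a minimal sketch of the two approaches side by side. The helper names count_by_rescanning and count_in_one_pass are hypothetical, chosen only for this illustration.

```python
from collections import Counter
from string import ascii_letters

sent = 'She sells seashells on the seashore the seashells she sells are seashells for sure'

def count_by_rescanning(text, alphabet=ascii_letters):
    # str.count scans the whole text, and it is called once per letter.
    return {ch: text.count(ch) for ch in alphabet}

def count_in_one_pass(text, alphabet=ascii_letters):
    # One pass over the text builds the Counter; a second, short pass
    # over its distinct characters keeps only the letters.
    allowed = set(alphabet)
    return {k: v for k, v in Counter(text).items() if k in allowed}

slow = count_by_rescanning(sent)
fast = count_in_one_pass(sent)

# The rescanning version also reports letters that never occur (count 0);
# dropping those entries leaves exactly the one-pass result.
assert {k: v for k, v in slow.items() if v} == fast
print(len(slow), len(fast))  # 52 12
```

The two results differ only in the zero-count entries, so which one to prefer depends on whether absent letters should appear in the dictionary.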

Answer 2

It seems much easier to simply look at each letter in sent and increment that letter's count each time.

my_dict = {}

for lett in sent:
    if lett in my_dict:
        my_dict[lett] += 1
    else:
        # first entry
        my_dict[lett] = 1

Or, more simply, using dict.setdefault:

for lett in sent:
    my_dict[lett] = my_dict.setdefault(lett, 0) + 1
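Another way to avoid the explicit if/else, sketched here as an alternative rather than as part of the original answer, is collections.defaultdict(int), which supplies 0 the first time a key is seen.

```python
from collections import defaultdict

sent = 'She sells seashells on the seashore the seashells she sells are seashells for sure'

my_dict = defaultdict(int)  # a missing key starts at int() == 0
for lett in sent:
    my_dict[lett] += 1      # no membership test needed

print(my_dict['e'])  # 16
print(my_dict['s'])  # 17
```

Like the plain loop, this counts every character, spaces included, so any filtering still has to happen separately.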

Note, however, that the stdlib module collections has an object called Counter that does exactly this.

from collections import Counter

my_dict = Counter(sent)

You can go further and use filter to drop unwanted characters before counting:

import string

alpha = set(string.ascii_letters)

filtered = filter(lambda ch: ch in alpha, sent)

my_dict = Counter(filtered)
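The same filtering can also be written with a generator expression instead of filter and a lambda. This is a stylistic alternative, not something the answer prescribes; the sketch is self-contained, so it defines sent and alpha itself.

```python
import string
from collections import Counter

sent = 'She sells seashells on the seashore the seashells she sells are seashells for sure'
alpha = set(string.ascii_letters)

# Counter accepts any iterable, so the filtered characters are fed in lazily.
my_dict = Counter(ch for ch in sent if ch in alpha)

print(my_dict.most_common(3))  # [('s', 17), ('e', 16), ('l', 10)]
```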

Answer 3

Use a dictionary comprehension:

mydict_countalpha = {c:sent.count(c) for c in alpha}

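But using a Counter object is more efficient: the current solution is O(n^2), whereas a Counter is built in O(n), after which we can filter out the keys that are not in the alpha string.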

from collections import Counter
mydict_countalpha = {k:v for k,v in Counter(sent).items() if k in alpha}
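One difference between the two snippets is worth spelling out: the str.count comprehension produces an entry for every letter in alpha, zeros included, while the filtered Counter contains only the letters that occur. If the zero entries are wanted (the output in the question includes 'b': 0), one possible sketch is to index the Counter by each letter, since a Counter returns 0 for keys it has not seen.

```python
from collections import Counter
from string import ascii_letters as alpha

sent = 'She sells seashells on the seashore the seashells she sells are seashells for sure'

counts = Counter(sent)  # single pass over the sentence

# Looking up a missing key in a Counter gives 0 instead of raising KeyError,
# so every letter in alpha gets an entry.
mydict_countalpha = {c: counts[c] for c in alpha}

print(mydict_countalpha['a'], mydict_countalpha['b'])  # 5 0
```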

Answer 4

You can use a dictionary comprehension

mydict_countalpha = {alpha[x]:sent.count(alpha[x]) for x in range(len(alpha))}

But there is no need to keep looking letters up by index. Loop over alpha directly

mydict_countalpha = {ch:sent.count(ch) for ch in alpha}

However, the way I would normally do this is with collections.Counter

from collections import Counter
mydict_countalpha = {k: v for k, v in Counter(sent).items() if k in alpha}

Edit: added a loop version

mydict_countalpha = {}
for ch in alpha:
    mydict_countalpha[ch] = sent.count(ch)

Answer 5

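On the edX discussion forum there are many comments from other HarvardX students who tried different approaches (including for loops and comprehensions) to write a correct answer but still could not get the points. Same here!

Below is the approach beginners are expected to use according to the course. I have tweaked it slightly so that any student browsing this still has to write their own code to pass...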

import string

sentenceA = 'I could not collect points on this homework and that is sad'
alphabet_string = string.ascii_letters
count_lett_dict = {}
for letters in sentenceA:
    if letters in alphabet_string:
        if letters in count_lett_dict:
            count_lett_dict[letters] += 1
        else:
            count_lett_dict[letters] = 1
count_lett_dict
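If the same count is needed for more than one sentence, the loop can be wrapped in a function. The name count_letters and its parameters are hypothetical, chosen only for illustration, and dict.get stands in for the inner if/else.

```python
import string

def count_letters(sentence, alphabet=string.ascii_letters):
    """Return a dict mapping each letter of `sentence` found in `alphabet` to its count."""
    counts = {}
    for ch in sentence:
        if ch in alphabet:
            # get() returns 0 the first time a letter is seen
            counts[ch] = counts.get(ch, 0) + 1
    return counts

sentenceA = 'I could not collect points on this homework and that is sad'
result = count_letters(sentenceA)
print(result['o'])  # 7
```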

5 answers

Answer 1: score 3, 2018-08-23 17:50:51
Answer 2: score 2, 2018-08-23 17:52:25
Answer 3: score 1, 2018-08-23 17:50:38
Answer 4: score 1, accepted, 2018-08-23 17:52:35
Answer 5: score 0, 2018-08-24 12:20:11
