
Find occurrences from a list of strings

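I want to create a function, without external libraries, that looks through a list of words (strings), counts the occurrences of only the words with more than 3 characters, and then prints them in order.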

The word list:

word_list = ['THE', 'ZEN', 'OF', 'PYTHON', 'BY', 'TIM', 'PETERS', 'BEAUTIFUL', 'IS', 'BETTER', 'THAN', 'UGLY', 'EXPLICIT', 'IS', 'BETTER', 'THAN', 'IMPLICIT', 'SIMPLE', 'IS', 'BETTER', 'THAN', 'COMPLEX', 'COMPLEX', 'IS', 'BETTER', 'THAN', 'COMPLICATED', 'FLAT', 'IS', 'BETTER', 'THAN', 'NESTED', 'SPARSE', 'IS', 'BETTER', 'THAN', 'DENSE', 'READABILITY', 'COUNTS', 'SPECIAL', 'CASES', 'ARENT', 'SPECIAL', 'ENOUGH', 'TO', 'BREAK', 'THE', 'RULES', 'ALTHOUGH', 'PRACTICALITY', 'BEATS', 'PURITY', 'ERRORS', 'SHOULD', 'NEVER', 'PASS', 'SILENTLY', 'UNLESS', 'EXPLICITLY', 'SILENCED', 'IN', 'THE', 'FACE', 'OF', 'AMBIGUITY', 'REFUSE', 'THE', 'TEMPTATION', 'TO', 'GUESS', 'THERE', 'SHOULD', 'BE', 'ONE', 'AND', 'PREFERABLY', 'ONLY', 'ONE', 'OBVIOUS', 'WAY', 'TO', 'DO', 'IT', 'ALTHOUGH', 'THAT', 'WAY', 'MAY', 'NOT', 'BE', 'OBVIOUS', 'AT', 'FIRST', 'UNLESS', 'YOURE', 'DUTCH', 'NOW', 'IS', 'BETTER', 'THAN', 'NEVER', 'ALTHOUGH', 'NEVER', 'IS', 'OFTEN', 'BETTER', 'THAN', 'RIGHT', 'NOW', 'IF', 'THE', 'IMPLEMENTATION', 'IS', 'HARD', 'TO', 'EXPLAIN', 'ITS', 'A', 'BAD', 'IDEA', 'IF', 'THE', 'IMPLEMENTATION', 'IS', 'EASY', 'TO', 'EXPLAIN', 'IT', 'MAY', 'BE', 'A', 'GOOD', 'IDEA', 'NAMESPACES', 'ARE', 'ONE', 'HONKING', 'GREAT', 'IDEA', '', 'LETS', 'DO', 'MORE', 'OF', 'THOSE']

Desired output:

Words with more than 3 letters

1 BETTER shows up 8 times
2 THAN shows up 8 times
.
.
.

You can use:

word_list.count("BETTER")

output:

8

Or:

from collections import Counter
Counter(word_list)

output:

Counter({'IS': 10, 'BETTER': 8, 'THAN': 8, 'THE': 6, 'TO': 5, 'OF': 3, 'ALTHOUGH': 3, 'NEVER': 3, 'BE': 3, 'ONE': 3, 'IDEA': 3, 'COMPLEX': 2, 'SPECIAL': 2, 'SHOULD': 2, 'UNLESS': 2, 'OBVIOUS': 2, 'WAY': 2, 'DO': 2, 'IT': 2, 'MAY': 2, 'NOW': 2, 'IF': 2, 'IMPLEMENTATION': 2, 'EXPLAIN': 2, 'A': 2, 'ZEN': 1, 'PYTHON': 1, 'BY': 1, 'TIM': 1, 'PETERS': 1, 'BEAUTIFUL': 1, 'UGLY': 1, 'EXPLICIT': 1, 'IMPLICIT': 1, 'SIMPLE': 1, 'COMPLICATED': 1, 'FLAT': 1, 'NESTED': 1, 'SPARSE': 1, 'DENSE': 1, 'READABILITY': 1, 'COUNTS': 1, 'CASES': 1, 'ARENT': 1, 'ENOUGH': 1, 'BREAK': 1, 'RULES': 1, 'PRACTICALITY': 1, 'BEATS': 1, 'PURITY': 1, 'ERRORS': 1, 'PASS': 1, 'SILENTLY': 1, 'EXPLICITLY': 1, 'SILENCED': 1, 'IN': 1, 'FACE': 1, 'AMBIGUITY': 1, 'REFUSE': 1, 'TEMPTATION': 1, 'GUESS': 1, 'THERE': 1, 'AND': 1, 'PREFERABLY': 1, 'ONLY': 1, 'THAT': 1, 'NOT': 1, 'AT': 1, 'FIRST': 1, 'YOURE': 1, 'DUTCH': 1, 'OFTEN': 1, 'RIGHT': 1, 'HARD': 1, 'ITS': 1, 'BAD': 1, 'EASY': 1, 'GOOD': 1, 'NAMESPACES': 1, 'ARE': 1, 'HONKING': 1, 'GREAT': 1, '': 1, 'LETS': 1, 'MORE': 1, 'THOSE': 1})
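The Counter above still includes short words and is not yet in the numbered format the question asks for. As a minimal sketch, assuming "more than 3 letters" means a length of at least 4 and that "in order" means most frequent first, the filtering can be done before counting and `most_common()` can supply the ordering. `sample_words` and `ranked_long_words` are illustrative names, not part of the original answer:

```python
from collections import Counter

# Short sample standing in for the full word_list (an assumption for brevity).
sample_words = ['BEAUTIFUL', 'IS', 'BETTER', 'THAN', 'UGLY',
                'EXPLICIT', 'IS', 'BETTER', 'THAN', 'IMPLICIT']

def ranked_long_words(words, min_length=4):
    """Return (word, count) pairs for long words, most frequent first."""
    counts = Counter(w for w in words if len(w) >= min_length)
    # most_common() sorts by count; equal counts keep first-seen order.
    return counts.most_common()

for rank, (word, count) in enumerate(ranked_long_words(sample_words), start=1):
    print(rank, word, 'shows up', count, 'times')
```

Note that `collections` is part of the standard library, so this may or may not satisfy the "no external libraries" constraint depending on how strictly it is read.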

A simple approach using built-in Python functions:

keys = set(word_list)

values = [word_list.count(key) for key in keys]

for k, v in zip(keys, values):
    print('item', k, 'has count', v)

Output:

item EASY has count 1
item IS has count 10
item DENSE has count 1
item EXPLICITLY has count 1
item FIRST has count 1
item THE has count 6
item DUTCH has count 1
item ONE has count 3
item BEAUTIFUL has count 1
item TO has count 5
item LETS has count 1
item BREAK has count 1
item READABILITY has count 1
item THAT has count 1
item GREAT has count 1
item IF has count 2
item NOW has count 2
item GOOD has count 1
item ALTHOUGH has count 3
item WAY has count 2
item MORE has count 1
item NESTED has count 1
item SPARSE has count 1
item AND has count 1
item ERRORS has count 1
item ZEN has count 1
item BY has count 1
item SILENCED has count 1
item ITS has count 1
item BETTER has count 8
item OBVIOUS has count 2
item ONLY has count 1
item THOSE has count 1
item ARENT has count 1
item REFUSE has count 1
item EXPLICIT has count 1
item BAD has count 1
item COMPLEX has count 2
item SILENTLY has count 1
item BE has count 3
item COMPLICATED has count 1
item PETERS has count 1
item SHOULD has count 2
item PREFERABLY has count 1
item UNLESS has count 2
item RULES has count 1
item NAMESPACES has count 1
item THERE has count 1
item OF has count 3
item EXPLAIN has count 2
item IMPLEMENTATION has count 2
item HARD has count 1
item IN has count 1
item COUNTS has count 1
item NOT has count 1
item A has count 2
item YOURE has count 1
item PURITY has count 1
item NEVER has count 3
item IMPLICIT has count 1
item DO has count 2
item ARE has count 1
item BEATS has count 1
item HONKING has count 1
item AMBIGUITY has count 1
item PRACTICALITY has count 1
item RIGHT has count 1
item ENOUGH has count 1
item MAY has count 2
item UGLY has count 1
item SIMPLE has count 1
item TIM has count 1
item IT has count 2
item CASES has count 1
item FLAT has count 1
item FACE has count 1
item THAN has count 8
item AT has count 1
item TEMPTATION has count 1
item PYTHON has count 1
item SPECIAL has count 2
item PASS has count 1
item IDEA has count 3
item OFTEN has count 1
item GUESS has count 1
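Because a set has no defined order, the listing above can come out differently from run to run, and it still includes words of 3 letters or fewer. Here is a rough sketch of one way to filter and order the pairs, assuming "in order" means most frequent first with ties broken alphabetically (the question does not say). The names `sample_words` and `sorted_long_word_counts` are hypothetical:

```python
# Short sample standing in for the full word_list (an assumption for brevity).
sample_words = ['SIMPLE', 'IS', 'BETTER', 'THAN', 'COMPLEX',
                'COMPLEX', 'IS', 'BETTER', 'THAN', 'COMPLICATED',
                'FLAT', 'IS', 'BETTER', 'THAN', 'NESTED']

def sorted_long_word_counts(words, min_length=4):
    """Return (word, count) pairs for words with at least min_length letters."""
    keys = {w for w in words if len(w) >= min_length}
    pairs = [(key, words.count(key)) for key in keys]
    # Sort by descending count, then alphabetically so ties are deterministic.
    return sorted(pairs, key=lambda pair: (-pair[1], pair[0]))

for word, count in sorted_long_word_counts(sample_words):
    print('item', word, 'has count', count)
```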

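It seems you are looking for a way to count how often each unique value occurs in a list. You could use a combination of set() (which holds the unique values) and len(), as explained in this answer, to get the number of unique values, but to get the per-word counts you can use: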

#print(len(set(word_list)))
#86

# Map each word to its count; repeated words just overwrite the same value.
counts_unique_values = {word: word_list.count(word) for word in word_list}
print(counts_unique_values)

output:

{'THE': 6, 'ZEN': 1, 'OF': 3, 'PYTHON': 1, 'BY': 1, 'TIM': 1, 'PETERS': 1, 'BEAUTIFUL': 1, 'IS': 10, 'BETTER': 8, 'THAN': 8, 'UGLY': 1, 'EXPLICIT': 1, 'IMPLICIT': 1, 'SIMPLE': 1, 'COMPLEX': 2, 'COMPLICATED': 1, 'FLAT': 1, 'NESTED': 1, 'SPARSE': 1, 'DENSE': 1, 'READABILITY': 1, 'COUNTS': 1, 'SPECIAL': 2, 'CASES': 1, 'ARENT': 1, 'ENOUGH': 1, 'TO': 5, 'BREAK': 1, 'RULES': 1, 'ALTHOUGH': 3, 'PRACTICALITY': 1, 'BEATS': 1, 'PURITY': 1, 'ERRORS': 1, 'SHOULD': 2, 'NEVER': 3, 'PASS': 1, 'SILENTLY': 1, 'UNLESS': 2, 'EXPLICITLY': 1, 'SILENCED': 1, 'IN': 1, 'FACE': 1, 'AMBIGUITY': 1, 'REFUSE': 1, 'TEMPTATION': 1, 'GUESS': 1, 'THERE': 1, 'BE': 3, 'ONE': 3, 'AND': 1, 'PREFERABLY': 1, 'ONLY': 1, 'OBVIOUS': 2, 'WAY': 2, 'DO': 2, 'IT': 2, 'THAT': 1, 'MAY': 2, 'NOT': 1, 'AT': 1, 'FIRST': 1, 'YOURE': 1, 'DUTCH': 1, 'NOW': 2, 'OFTEN': 1, 'RIGHT': 1, 'IF': 2, 'IMPLEMENTATION': 2, 'HARD': 1, 'EXPLAIN': 2, 'ITS': 1, 'A': 2, 'BAD': 1, 'IDEA': 3, 'EASY': 1, 'GOOD': 1, 'NAMESPACES': 1, 'ARE': 1, 'HONKING': 1, 'GREAT': 1, '': 1, 'LETS': 1, 'MORE': 1, 'THOSE': 1}
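Calling `count()` once per element rescans the whole list each time, which is fine for a list this size but grows quadratically with its length. A possible alternative, sketched here with a plain dict and no imports, tallies everything in a single pass. `tally` and `sample_words` are made-up names for illustration:

```python
def tally(words):
    """Count occurrences of each word in one pass over the list."""
    counts = {}
    for word in words:
        # dict.get returns 0 the first time a word is seen.
        counts[word] = counts.get(word, 0) + 1
    return counts

# Short sample standing in for the full word_list (an assumption for brevity).
sample_words = ['NOW', 'IS', 'BETTER', 'THAN', 'NEVER', 'ALTHOUGH', 'NEVER',
                'IS', 'OFTEN', 'BETTER', 'THAN', 'RIGHT', 'NOW']
print(tally(sample_words))
```

As with the `dict(zip(...))` version, the keys come out in order of first appearance.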

Option 2: You can convert the list into a pandas DataFrame by calling pd.DataFrame() directly, then use value_counts() to count the distinct values in the single column:

import pandas as pd
#df = pd.DataFrame(word_list)
df = pd.DataFrame({'Text': word_list})
df.value_counts()

output:

Text    
IS          10
BETTER       8
THAN         8
THE          6
TO           5
            ..
ITS          1
YOURE        1
IN           1
IMPLICIT     1
             1
Length: 86, dtype: int64

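To cover the remaining requirement:

...and count their occurrences only if the word has more than 3 characters and then print them in order

you might use something like the following: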

for word in word_list:
    if len(word) > 3:
        pass  # placeholder: count and print the word here

You can try the following:

word_list = ['THE', 'ZEN', 'OF', 'PYTHON', 'BY', 'TIM', 'PETERS', 'BEAUTIFUL', 'IS', 'BETTER', 'THAN', 'UGLY', 'EXPLICIT', 'IS', 'BETTER', 'THAN', 'IMPLICIT', 'SIMPLE', 'IS', 'BETTER', 'THAN', 'COMPLEX', 'COMPLEX', 'IS', 'BETTER', 'THAN', 'COMPLICATED', 'FLAT', 'IS', 'BETTER', 'THAN', 'NESTED', 'SPARSE', 'IS', 'BETTER', 'THAN', 'DENSE', 'READABILITY', 'COUNTS', 'SPECIAL', 'CASES', 'ARENT', 'SPECIAL', 'ENOUGH', 'TO', 'BREAK', 'THE', 'RULES', 'ALTHOUGH', 'PRACTICALITY', 'BEATS', 'PURITY', 'ERRORS', 'SHOULD', 'NEVER', 'PASS', 'SILENTLY', 'UNLESS', 'EXPLICITLY', 'SILENCED', 'IN', 'THE', 'FACE', 'OF', 'AMBIGUITY', 'REFUSE', 'THE', 'TEMPTATION', 'TO', 'GUESS', 'THERE', 'SHOULD', 'BE', 'ONE', 'AND', 'PREFERABLY', 'ONLY', 'ONE', 'OBVIOUS', 'WAY', 'TO', 'DO', 'IT', 'ALTHOUGH', 'THAT', 'WAY', 'MAY', 'NOT', 'BE', 'OBVIOUS', 'AT', 'FIRST', 'UNLESS', 'YOURE', 'DUTCH', 'NOW', 'IS', 'BETTER', 'THAN', 'NEVER', 'ALTHOUGH', 'NEVER', 'IS', 'OFTEN', 'BETTER', 'THAN', 'RIGHT', 'NOW', 'IF', 'THE', 'IMPLEMENTATION', 'IS', 'HARD', 'TO', 'EXPLAIN', 'ITS', 'A', 'BAD', 'IDEA', 'IF', 'THE', 'IMPLEMENTATION', 'IS', 'EASY', 'TO', 'EXPLAIN', 'IT', 'MAY', 'BE', 'A', 'GOOD', 'IDEA', 'NAMESPACES', 'ARE', 'ONE', 'HONKING', 'GREAT', 'IDEA', '', 'LETS', 'DO', 'MORE', 'OF', 'THOSE']
for word in word_list:
    if len(word) > 3:
        print(word, ":", word_list.count(word))

Alternatively, if you want to wrap the snippet above in a function, you can do the following:

word_list = ['THE', 'ZEN', 'OF', 'PYTHON', 'BY', 'TIM', 'PETERS', 'BEAUTIFUL', 'IS', 'BETTER', 'THAN', 'UGLY', 'EXPLICIT', 'IS', 'BETTER', 'THAN', 'IMPLICIT', 'SIMPLE', 'IS', 'BETTER', 'THAN', 'COMPLEX', 'COMPLEX', 'IS', 'BETTER', 'THAN', 'COMPLICATED', 'FLAT', 'IS', 'BETTER', 'THAN', 'NESTED', 'SPARSE', 'IS', 'BETTER', 'THAN', 'DENSE', 'READABILITY', 'COUNTS', 'SPECIAL', 'CASES', 'ARENT', 'SPECIAL', 'ENOUGH', 'TO', 'BREAK', 'THE', 'RULES', 'ALTHOUGH', 'PRACTICALITY', 'BEATS', 'PURITY', 'ERRORS', 'SHOULD', 'NEVER', 'PASS', 'SILENTLY', 'UNLESS', 'EXPLICITLY', 'SILENCED', 'IN', 'THE', 'FACE', 'OF', 'AMBIGUITY', 'REFUSE', 'THE', 'TEMPTATION', 'TO', 'GUESS', 'THERE', 'SHOULD', 'BE', 'ONE', 'AND', 'PREFERABLY', 'ONLY', 'ONE', 'OBVIOUS', 'WAY', 'TO', 'DO', 'IT', 'ALTHOUGH', 'THAT', 'WAY', 'MAY', 'NOT', 'BE', 'OBVIOUS', 'AT', 'FIRST', 'UNLESS', 'YOURE', 'DUTCH', 'NOW', 'IS', 'BETTER', 'THAN', 'NEVER', 'ALTHOUGH', 'NEVER', 'IS', 'OFTEN', 'BETTER', 'THAN', 'RIGHT', 'NOW', 'IF', 'THE', 'IMPLEMENTATION', 'IS', 'HARD', 'TO', 'EXPLAIN', 'ITS', 'A', 'BAD', 'IDEA', 'IF', 'THE', 'IMPLEMENTATION', 'IS', 'EASY', 'TO', 'EXPLAIN', 'IT', 'MAY', 'BE', 'A', 'GOOD', 'IDEA', 'NAMESPACES', 'ARE', 'ONE', 'HONKING', 'GREAT', 'IDEA', '', 'LETS', 'DO', 'MORE', 'OF', 'THOSE']

def count_occurrences(list_of_words):
    """Print each word longer than 3 characters with its count in the list."""
    for word in list_of_words:
        if len(word) > 3:
            print(word, ":", list_of_words.count(word))

count_occurrences(word_list)

The output is then:

PYTHON : 1
PETERS : 1
BEAUTIFUL : 1
BETTER : 8
THAN : 8
UGLY : 1
EXPLICIT : 1
BETTER : 8
THAN : 8
IMPLICIT : 1
SIMPLE : 1
BETTER : 8
THAN : 8
COMPLEX : 2
COMPLEX : 2
BETTER : 8
THAN : 8
COMPLICATED : 1
FLAT : 1
BETTER : 8
THAN : 8
NESTED : 1
SPARSE : 1
BETTER : 8
THAN : 8
DENSE : 1
READABILITY : 1
COUNTS : 1
SPECIAL : 2
CASES : 1
ARENT : 1
SPECIAL : 2
ENOUGH : 1
BREAK : 1
RULES : 1
ALTHOUGH : 3
PRACTICALITY : 1
BEATS : 1
PURITY : 1
ERRORS : 1
SHOULD : 2
NEVER : 3
PASS : 1
SILENTLY : 1
UNLESS : 2
EXPLICITLY : 1
SILENCED : 1
FACE : 1
AMBIGUITY : 1
REFUSE : 1
TEMPTATION : 1
GUESS : 1
THERE : 1
SHOULD : 2
PREFERABLY : 1
ONLY : 1
OBVIOUS : 2
ALTHOUGH : 3
THAT : 1
OBVIOUS : 2
FIRST : 1
UNLESS : 2
YOURE : 1
DUTCH : 1
BETTER : 8
THAN : 8
NEVER : 3
ALTHOUGH : 3
NEVER : 3
OFTEN : 1
BETTER : 8
THAN : 8
RIGHT : 1
IMPLEMENTATION : 2
HARD : 1
EXPLAIN : 2
IDEA : 3
IMPLEMENTATION : 2
EASY : 1
EXPLAIN : 2
GOOD : 1
IDEA : 3
NAMESPACES : 1
HONKING : 1
GREAT : 1
IDEA : 3
LETS : 1
MORE : 1
THOSE : 1
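The loop prints a line every time a word occurs, so BETTER and THAN each show up eight times in the output above. One possible way to report each word only once is to remember which words have already been handled. This sketch keeps first-appearance order and returns the lines instead of printing them directly; `unique_long_word_lines` and `sample_words` are hypothetical names:

```python
def unique_long_word_lines(list_of_words, min_length=4):
    """Build one 'WORD : count' line per distinct word of at least min_length letters."""
    seen = set()
    lines = []
    for word in list_of_words:
        if len(word) >= min_length and word not in seen:
            seen.add(word)  # skip this word on later occurrences
            lines.append(f"{word} : {list_of_words.count(word)}")
    return lines

# Short sample standing in for the full word_list (an assumption for brevity).
sample_words = ['SPECIAL', 'CASES', 'ARENT', 'SPECIAL', 'ENOUGH',
                'TO', 'BREAK', 'THE', 'RULES']
print("\n".join(unique_long_word_lines(sample_words)))
```

If the lines should be ordered by frequency rather than by first appearance, the counts would need to be collected first and sorted before formatting.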


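Notice: The technical posts on this site are published under the CC BY-SA 4.0 license. If you repost them, please credit this site's URL or the original address. For any questions, contact yoyou2525@163.com.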

 