
Removing words from list, only first if statement is being executed

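I have a long string of hard-to-decipher text in which every line ends with a closing parenthesis (only one line is included here, because I can't get the program to work on even a single line):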

"Thyroid Disorders   Understanding Concepts  Kaplan Endocrine Focused Review Tests   n/a 88% (35/40)"

I am trying to format it like this and append it to a file:

"Thyroid Disorders Understanding Concepts 88% (35/40)"

So I need to remove the strings "Kaplan", "Endocrine", "Focused", "Review", "Tests", and "n/a" from each line, and strip out the tabs and newlines.

Here is my code:

text = """Thyroid Disorders Understanding Concepts  Kaplan Endocrine A  Focused Review Tests    n/a 88% (35/40)
"""

line = ''
for character in text:
    line = line + character # append every character to string
    if character == ')': #  closing parenthesis signals end of one line
        print('Original line: '+ line) # sanity check 
        line_as_list = line.split() # removes tabs/newlines and makes it easier to remove certain strings
        for word in line_as_list: # loop through each list item, remove if needed
            if word == 'Kaplan':
                line_as_list.remove(word)
                print(line_as_list) # another sanity check, 'Kaplan' is gone

            if word == 'Endocrine': # never runs
                line_as_list.remove(word)
                print(line_as_list )
            
            # Intentionally left out the rest of the words that need to be removed

This prints the following:

"Original line: Thyroid Disorders    Understanding Concepts   Kaplan Endocrine A   Focused Review Tests n/a   88% (35/
40)"
['Thyroid', 'Disorders', 'Understanding', 'Concepts', 'Endocrine', 'A', 'Focused', 'Review', 'Tests',
'n/a', '88%', '(35/40)']

The code under the first if statement runs as I intended, but the block under if word == 'Endocrine' never runs.

I have also tried both of the following:

if word == 'Kaplan' or word == 'Endocrine':
  line_as_list.remove(word)

if word == 'Kaplan':
  line_as_list.remove(word)
elif word == 'Endocrine':
  line_as_list.remove(word)

Neither works; "Kaplan" is the only word that gets removed. Thanks for any help with this.

Problem description

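The problem is that you are mutating the list you are currently iterating over. Because Endocrine comes immediately after Kaplan, removing Kaplan shifts Endocrine into Kaplan's index. The loop then moves on to the next index (Endocrine's old position), so Endocrine is never visited. This is easy to demonstrate: add another string between Kaplan and Endocrine in your own code and you will see both of them removed, because the word in between is the one that gets skipped.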
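To make the skipping visible, here is a minimal sketch that records which words the loop actually visits while items are being removed. The shortened word list and the `visited` name are illustrative and not part of the question's code.

```python
words = ['Concepts', 'Kaplan', 'Endocrine', 'A', 'Focused']
visited = []  # every word the loop body actually sees

for word in words:  # the loop advances an internal index over the live list
    visited.append(word)
    if word in ('Kaplan', 'Endocrine'):
        words.remove(word)  # later items shift one slot to the left

print(visited)  # ['Concepts', 'Kaplan', 'A', 'Focused']
print(words)    # ['Concepts', 'Endocrine', 'A', 'Focused']
```

'Endocrine' never appears in `visited`: it slid into index 1 when 'Kaplan' was removed, and the loop had already moved on to index 2.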

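Solution

Best practice is to build a new list that leaves out the items you want to remove, rather than mutating the input list.

I suggest using a list comprehension, which creates that new list in a single expression.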

text = """Thyroid Disorders Understanding Concepts  Kaplan Endocrine A  Focused Review Tests    n/a 88% (35/40)
"""

line = ''
for character in text:
    line += character # append every character to string
    if character == ')': #  closing parenthesis signals end of one line
        print('Original line: '+ line) # sanity check 
        new_list = [word for word in line.split() if word not in ["Kaplan", "Endocrine"]] # loop through each list item, remove if needed
        print(new_list)

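The error occurs because remove shifts every following element back by one position while the iterator's index is not adjusted. After Kaplan is removed, Endocrine sits in Kaplan's old position, which the loop has already passed, so the second check is never triggered for it. A simple fix is to drop the inner loop and test for membership directly: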

text = """Thyroid Disorders Understanding Concepts  Kaplan Endocrine A  Focused Review Tests    n/a 88% (35/40)
"""

line = ''
print([char for char in text.split()])
for character in text:
    line = line + character # append every character to string
    if character == ')': #  ')' signals end of one line
        print('Original line: '+ line) # sanity check 
        line_as_list = line.split()
        if "Kaplan" in line_as_list:
            line_as_list.remove("Kaplan")
        if "Endocrine" in line_as_list:
            line_as_list.remove("Endocrine")

Change this line:

for word in line_as_list:

to:

for word in line_as_list.copy():

That way the loop iterates over a copy, so removing "Kaplan" from the original list does not affect the iteration.
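As a quick check of this one-line change, the sketch below applies it to the sample line from the question. The per-character loop is left out for brevity, which is a simplification rather than part of the original answer.

```python
line = "Thyroid Disorders Understanding Concepts  Kaplan Endocrine A  Focused Review Tests    n/a 88% (35/40)"
line_as_list = line.split()

for word in line_as_list.copy():  # iterate over a snapshot of the list
    if word == 'Kaplan':
        line_as_list.remove(word)  # mutates only the original list
    if word == 'Endocrine':
        line_as_list.remove(word)

print(line_as_list)
```

Both words are now removed. Note that `list.remove` deletes only the first matching item per call, which is sufficient here because each word is visited once in the copy.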

Try the following:

text = "Thyroid Disorders Understanding Concepts  Kaplan Endocrine A  Focused Review Tests    n/a 88% (35/40)"
words_to_remove = {'Kaplan', 'Endocrine', 'Focused', 'Review', 'Tests', 'n/a'}
print(' '.join([w for w in text.split() if w not in words_to_remove]))

output

Thyroid Disorders Understanding Concepts A 88% (35/40)
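The same filter can be extended to the multi-record input described in the question. The following is only a sketch: `clean_records` is a hypothetical helper name, and it assumes every record ends with ')' and contains no other ')' characters. The sample simply repeats the question's line twice.

```python
WORDS_TO_REMOVE = {'Kaplan', 'Endocrine', 'Focused', 'Review', 'Tests', 'n/a'}

def clean_records(text):
    """Split text into records ending in ')' and drop the unwanted words."""
    cleaned = []
    for chunk in text.split(')'):
        words = chunk.split()  # also discards tabs and newlines
        if not words:          # skip the empty remainder after the last ')'
            continue
        kept = [w for w in words if w not in WORDS_TO_REMOVE]
        cleaned.append(' '.join(kept) + ')')  # restore the ')' that split() consumed
    return cleaned

record = "Thyroid Disorders\tUnderstanding Concepts\tKaplan Endocrine Focused Review Tests\tn/a 88% (35/40)\n"
for cleaned_line in clean_records(record * 2):
    print(cleaned_line)
```

Each returned string could then be appended to a file opened with `open(path, 'a')`, writing one record per line.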

