How can I modify my functions to use list comprehensions?

Question

Specifically, when removing the stop letters in this getwords function:

def getwords(fileName):
  file = open(fileName, 'r')
  text = file.read()
  stopletters = [".", ",", ";", ":", "'s", '"', "!", "?", "(", ")", '“', '”']
  text = text.lower()
  for letter in stopletters:
    text = text.replace(letter, "")
  words = text.split()
  return words

And for the loop in this bigrams function:

def compute_bigrams(fileName):
  input_list = getwords(fileName)
  bigram_list = {}
  for i in range(len(input_list) - 1):
    if input_list[i] in bigram_list:
      bigram_list[input_list[i]] = bigram_list[input_list[i]] + [input_list[i + 1]]
    else:
      bigram_list[input_list[i]] = [input_list[i + 1]]
  return bigram_list

Answer 1

You could rewrite it this way:

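def getwords(file_name):
    with open(file_name, 'r') as file:
        text = file.read().lower()

    # "'s" is two characters long, so the per-character test below can never
    # match it; remove it with replace() before filtering single characters.
    text = text.replace("'s", "")
    stop_letters = (".", ",", ";", ":", '"', "!", "?", "(", ")", '“', '”')
    text = ''.join([letter if letter not in stop_letters else '' for letter in text])

    words = text.split()
    return words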

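I used a context manager to open the file, merged some lines (.lower() does not need a line of its own), and used a list comprehension that goes over the text character by character and keeps a letter only if it is not in stop_letters. Joining that list gives the same result as the original loop for the single-character entries. The two-character entry "'s" can never equal a single character, so it has to be removed with a separate text.replace("'s", "") call.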

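Note that you could also use a generator expression, which is arguably cleaner because no intermediate list is written out: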

text = ''.join((letter if letter not in stop_letters else '' for letter in text))

If you really want to save that one line, you could do:

return text.split()
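
To try the character filter without a file, here is a minimal sketch that works on an in-memory string. The helper name strip_stop_letters, the constant STOP_LETTERS, and the sample sentence are made up for illustration and are not part of the original code. The sketch also assumes that "'s" should be stripped before the per-character filter, since a single character can never equal a two-character string.

```python
STOP_LETTERS = (".", ",", ";", ":", '"', "!", "?", "(", ")", "“", "”")


def strip_stop_letters(text):
    """Hypothetical helper: lower-case the text and drop the stop letters."""
    # Remove the possessive "'s" first; it is two characters long, so the
    # per-character test below could never match it.
    text = text.lower().replace("'s", "")
    # Generator expression: keep every character that is not a stop letter.
    return "".join(letter for letter in text if letter not in STOP_LETTERS)


sample = "John's dog (a big one) barked: \"Woof!\""
print(strip_stop_letters(sample).split())
```

Using an if clause at the end of the expression skips unwanted characters outright, which is equivalent to emitting '' for them and then joining.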

Answer 2

You can do the first replacement without a for loop by throwing in a bit of regex:

import re

# Match a run of stop characters, or the possessive "'s".
pattern = re.compile(r'''[.,;:"!?()“”]+|'s''')
pattern.sub('', 'this is a test string (it proves that the replacements work!).')
# 'this is a test string it proves that the replacements work'
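
Dropped into the word-splitting step, the regex approach might look like the sketch below. The names words_from_text and STOP_PATTERN are hypothetical, and the function takes a string instead of a file name so that it can run on its own. The pattern uses + for runs of punctuation and a literal 's, which is an assumption about the intended behaviour.

```python
import re

# One or more stop characters in a row, or the possessive "'s".
STOP_PATTERN = re.compile(r'''[.,;:"!?()“”]+|'s''')


def words_from_text(text):
    """Hypothetical helper: lower-case, strip stop letters, split into words."""
    cleaned = STOP_PATTERN.sub("", text.lower())
    return cleaned.split()


print(words_from_text("This is a test string (it proves that the replacements work!)."))
```

Compiling the pattern once at module level means repeated calls reuse the same compiled object.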

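Although it is theoretically possible to turn your second loop into a comprehension, I strongly advise against it. People, including yourself a few months from now, will have serious trouble understanding what it does. As @Alexander Cécile pointed out in the comments, you can refactor the second loop with for input in input_list, iterating over the words directly instead of over indices, which improves both the performance and the readability of the code.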
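
One way to read that suggestion is to iterate over the words themselves instead of over indices. The sketch below is an assumption about what such a refactoring could look like, not the commenter's exact code: the function names are hypothetical, the functions take a list of words rather than a file name, and zip and dict.setdefault are choices made here for illustration. A comprehension version is included only for comparison; it rescans all the pairs for every key, which illustrates why the plain loop is both easier to read and faster.

```python
def compute_bigrams_from_words(words):
    """Map each word to the list of words that directly follow it."""
    bigrams = {}
    # Pair every word with its successor: (w0, w1), (w1, w2), ...
    for current, following in zip(words, words[1:]):
        # append() extends the list in place instead of rebuilding it.
        bigrams.setdefault(current, []).append(following)
    return bigrams


def compute_bigrams_comprehension(words):
    """Same result as a dict comprehension; harder to read and quadratic."""
    pairs = list(zip(words, words[1:]))
    return {
        word: [following for current, following in pairs if current == word]
        for word in words[:-1]
    }


words = ["the", "cat", "saw", "the", "dog"]
print(compute_bigrams_from_words(words))
```

The loop version also avoids the bigram_list[...] + [...] concatenation from the question, which creates a new list every time a word repeats.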
