Iterating over a file object in Python does not work, but readlines() does, although it is inefficient
In the following code, if I use:

    for line in fin:

it only executes for 'a'. But if I use:

    wordlist = fin.readlines()
    for line in wordlist:

then it executes for 'a' through 'z'. But readlines() reads the whole file at once, which I don't want. How can I avoid this?
    def avoids():
        alphabet = 'abcdefghijklmnopqrstuvwxyz'
        num_words = {}
        fin = open('words.txt')
        for char in alphabet:
            num_words[char] = 0
            for line in fin:
                not_found = True
                word = line.strip()
                if word.lower().find(char.lower()) != -1:
                    num_words[char] += 1
        fin.close()
        return num_words
The syntax for line in fin can only be used once. After you do that, you've exhausted the file and you can't read it again unless you "reset the file pointer" with fin.seek(0). Conversely, fin.readlines() gives you a list, which you can iterate over as many times as you like.
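To make the rewind concrete, here is a minimal sketch of the original loop with fin.seek(0) added before each pass. The function name count_words_containing and the sample words are made up for illustration, and io.StringIO stands in for the real words.txt so that the snippet runs on its own.

```python
import io

def count_words_containing(fin, alphabet="abcdefghijklmnopqrstuvwxyz"):
    """Count, for each letter, the words (one per line) that contain it."""
    num_words = {}
    for char in alphabet:
        fin.seek(0)  # rewind: the previous pass left the file at end-of-file
        num_words[char] = 0
        for line in fin:
            if char in line.strip().lower():
                num_words[char] += 1
    return num_words

# io.StringIO is an assumption standing in for open('words.txt').
sample = io.StringIO("apple\nBanana\nkiwi\n")
counts = count_words_containing(sample)
print(counts["a"], counts["k"], counts["z"])  # 2 1 0
```

This keeps memory use low, but it reads the file once per letter (26 passes), so it trades memory for extra I/O.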
I think a simple refactor with Counter (Python 2.7+) could save you this headache:
    from collections import Counter
    with open('file') as fin:
        result = Counter()
        for line in fin:
            result += Counter(set(line.strip().lower()))
This counts the number of words in your file (one word per line) that contain a particular character, which I believe is what your original code does. Please correct me if I'm wrong.
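As a small worked example of why the set(...) call matters, the sketch below uses a made-up three-word list instead of a file. Wrapping each word in a set means a letter is counted once per word, no matter how often it occurs inside that word.

```python
from collections import Counter

# Without set(), repeated letters inside one word are all counted.
print(Counter("banana")["a"])       # 3
# With set(), each distinct letter contributes exactly 1 for this word.
print(Counter(set("banana"))["a"])  # 1

total = Counter()
for word in ["apple", "banana", "kiwi"]:  # hypothetical sample words
    total += Counter(set(word))

print(total["a"])  # 2 ("apple" and "banana")
print(total["i"])  # 1 ("kiwi" has two i's but counts once)
print(total["z"])  # 0 (a Counter returns 0 for missing keys)
```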
You could also do this easily with a defaultdict (Python 2.5+):
    from collections import defaultdict
    with open('file') as fin:
        result = defaultdict(int)
        for line in fin:
            chars = set(line.strip().lower())
            for c in chars:
                result[c] += 1
And finally, kicking it old-school (I don't even know when setdefault was introduced...):
    fin = open('file')
    result = dict()
    for line in fin:
        chars = set(line.strip().lower())
        for c in chars:
            result[c] = result.setdefault(c, 0) + 1
    fin.close()
You have three options:

Try:
    from collections import defaultdict
    from itertools import product

    def avoids():
        alphabet = 'abcdefghijklmnopqrstuvwxyz'
        num_words = defaultdict(int)
        with open('words.txt') as fin:
            words = [x.strip() for x in fin.readlines() if x.strip()]
        for ch, word in product(alphabet, words):
            if ch not in word:
                continue
            num_words[ch] += 1
        return num_words
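The version above still loads every word into a list, which the question wanted to avoid. One possible alternative, sketched below, is to swap the loop order so that the file is the outer loop and the alphabet is the inner one; the file is then read once, line by line. The name avoids_streaming is hypothetical, io.StringIO stands in for words.txt, and unlike the product version this sketch lowercases each word, as the code in the question does.

```python
import io
import string

def avoids_streaming(fin, alphabet=string.ascii_lowercase):
    """Single pass over fin; returns a dict with one entry per letter."""
    num_words = dict.fromkeys(alphabet, 0)  # every letter starts at 0
    for line in fin:                        # the file is consumed only once
        word = line.strip().lower()
        for ch in alphabet:
            if ch in word:
                num_words[ch] += 1
    return num_words

# io.StringIO is an assumption standing in for open('words.txt').
sample = io.StringIO("Apple\n\nbanana\n")
counts = avoids_streaming(sample)
print(counts["a"], counts["b"], counts["z"])  # 2 1 0
```

Because the result is pre-filled with zeros, letters that never appear still show up in the output, matching the shape of the dictionary built in the question.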