Python - closing a file if it meets a condition
I am trying to write a programme that goes through a directory, opens each file in turn, and checks a specific line before doing anything else. If the line meets a specific criterion (namely, that it does not match this line in any other file in the directory), the file closes and the programme moves on to the next file.
aps = []
import os
for filename in os.listdir("C:\..."):
    f = open(filename,"r")
    (f.readline())
    (f.readline())
    ap = (f.readline())
    ap = ap.rstrip("\n")
    aps.append(ap)
    freqs = {}
    for ap in aps:
        freqs[ap] = freqs.get(ap, 0) + 1
    for k, v in freqs.items():
        if v == 2:
            f.close()
        else:
For the 'else:', I originally tried 'f.seek(0)', but got an error saying that Python cannot work with a closed file. I then tried 'f = open(filename, "r")' again, but this does something odd: when I try to print the first line this way, it goes into a crazy loop and prints the line multiple times.
Is this the best way to go about this task? And if not, how could I get it to work?
Many thanks.
Don't close the file conditionally. Do what you need to do with the open file, and then close it at the end. With a with construct, the file will close automatically:
for filename in os.listdir(path):
    # os.listdir returns bare names, so join them with the directory
    with open(os.path.join(path, filename)) as f:
        # do processing here
        if positive_condition:
            # do more processing
            pass
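As a quick check of the point about automatic closing, here is a minimal sketch that writes a throwaway file and inspects the handle after the with block has ended. The temporary directory, the file name sample.txt and the helper first_line_and_handle are assumptions made purely for illustration.

```python
import os
import tempfile


def first_line_and_handle(path):
    """Return the first line of the file and the (already closed) handle."""
    with open(path) as f:
        first = f.readline().rstrip("\n")
    # Leaving the with block has closed f; no explicit close() is needed.
    return first, f


with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.txt")  # hypothetical file name
    with open(sample, "w") as out:
        out.write("header\nbody\n")
    first, handle = first_line_and_handle(sample)

print(first, handle.closed)
```

The handle reports closed even though close() was never called, which is why no conditional f.close() is needed inside the loop.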
Here is why your code fails. You initialize the aps list outside of your outer for loop, so it accumulates the specified line from every file that you have looped over so far. Your freqs dictionary, on the other hand, is reset to empty for each file that you open.
So these lines:
for ap in aps:
    freqs[ap] = freqs.get(ap, 0) + 1
loop over every line that has been read so far and count how often each one occurs. The problem comes in the inner for loop:
for k, v in freqs.items():
    if v == 2:
        f.close()
What happens here is that freqs has a set of keys potentially as large as the number of files you have looped over so far, and you are looping through every one of those keys. The first time a key has a value of 2, the current file is closed. But the loop continues, so the remaining keys are handled against a file that is already closed. A second f.close() is harmless, but the f.seek(0) in your else branch fails on a closed file. The else branch also runs once for every key whose value is not 2, which would explain why reopening and printing there shows the same line multiple times.
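A small sketch of what a closed file handle will and will not tolerate may help here. It suggests the error comes from f.seek(0) in the else branch rather than from a repeated f.close(). The temporary file and its name are assumptions for illustration only.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.txt")  # hypothetical file name
    with open(sample, "w") as out:
        out.write("line 1\n")

    f = open(sample)
    f.close()
    f.close()  # closing an already closed file is allowed and does nothing

    try:
        f.seek(0)  # seeking (or reading) on a closed file is not allowed
        error = None
    except ValueError as exc:
        error = str(exc)

print(error)
```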
The easiest fix is to add a break after the f.close(). But there are better ways to structure this code.
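To illustrate the effect of the break, here is a minimal sketch. The function scan_until_match, the made-up counts and the io.StringIO object standing in for a real open file are all assumptions, not part of the original code.

```python
import io


def scan_until_match(freqs, f):
    """Close f on the first key whose count is 2, then stop looping."""
    visited = []
    for key, count in freqs.items():
        visited.append(key)
        if count == 2:
            f.close()
            break  # later keys never touch the closed handle
    return visited


freqs = {"x": 1, "y": 2, "z": 2}  # made-up counts
handle = io.StringIO("stand-in for an open file")
visited = scan_until_match(freqs, handle)

print(visited, handle.closed)
```

The loop stops at "y", so "z" is never examined and nothing runs against the closed handle afterwards.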
One is to always open a file using a with statement, unless you have a good reason to do otherwise. So:
with open(filename, "r") as f:
    pass  # code
That way the file will close automatically when you are done with it.
I am assuming that the order in which you loop through the files isn't important, and that you want the frequency test to include all the files, not just the ones that have been opened so far. In that case it may be easier to loop through twice: once to assemble your frequency dict, and a second time to do whatever you want with the files that meet the frequency requirement.
import os

directory = r"C:\..."  # path elided in the original; raw string avoids escape issues
aps = []
freqs = {}
# First loop to read the important line from all files
for filename in os.listdir(directory):
    # os.listdir returns bare names, so join them with the directory
    with open(os.path.join(directory, filename), "r") as f:
        f.readline()
        f.readline()
        ap = f.readline().rstrip("\n")
        aps.append(ap)
# Populate the dictionary
for ap in aps:
    freqs[ap] = freqs.get(ap, 0) + 1
# Second loop to handle the important cases
for filename in os.listdir(directory):
    with open(os.path.join(directory, filename), "r") as f:
        f.readline()
        f.readline()
        ap = f.readline().rstrip("\n")
        if freqs[ap] != 2:
            pass  # do whatever
I strongly suspect there are more efficient and Pythonic ways of getting there, but this is my best thought.
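One possible tightening, offered as a sketch rather than a definitive version: collections.Counter can do the counting, and remembering each file's third line avoids reading every file twice. The helpers third_line and files_by_line_count, the demo file names and their contents are hypothetical, and the sketch assumes the directory holds only regular text files.

```python
import os
import tempfile
from collections import Counter


def third_line(path):
    """Return the third line of a file without its trailing newline."""
    with open(path, "r") as f:
        f.readline()
        f.readline()
        return f.readline().rstrip("\n")


def files_by_line_count(directory, wanted_count):
    """Return sorted names of files whose third line occurs in exactly wanted_count files."""
    paths = [os.path.join(directory, name) for name in os.listdir(directory)]
    lines = {path: third_line(path) for path in paths}  # each file is read once
    freqs = Counter(lines.values())
    return sorted(
        os.path.basename(path)
        for path, line in lines.items()
        if freqs[line] == wanted_count
    )


# Demo on throwaway files; names and contents are made up.
with tempfile.TemporaryDirectory() as tmp:
    contents = {"a.txt": "AP1", "b.txt": "AP1", "c.txt": "AP2"}
    for name, ap in contents.items():
        with open(os.path.join(tmp, name), "w") as out:
            out.write("one\ntwo\n" + ap + "\nrest\n")
    shared = files_by_line_count(tmp, 2)
    unique = files_by_line_count(tmp, 1)

print(shared, unique)
```

Here a.txt and b.txt share their third line, so they are returned for a count of 2, while c.txt is the only file returned for a count of 1. Files that need further processing can then be reopened by name with a with statement.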