Python script to remove lines from file containing words in array
I have the following script, which identifies the lines in a file that I want to remove, based on an array of words, but it does not actually remove them.
What should I change?
sourcefile = "C:\\Python25\\PC_New.txt"
filename2 = "C:\\Python25\\PC_reduced.txt"
offending = ["Exception","Integer","RuntimeException"]

def fixup( filename ):
    print "fixup ", filename
    fin = open( filename )
    fout = open( filename2 , "w")
    for line in fin.readlines():
        for item in offending:
            print "got one",line
            line = line.replace( item, "MUST DELETE" )
            line=line.strip()
            fout.write(line)
    fin.close()
    fout.close()

fixup(sourcefile)
sourcefile = "C:\\Python25\\PC_New.txt"
filename2 = "C:\\Python25\\PC_reduced.txt"
offending = ["Exception","Integer","RuntimeException"]

def fixup( filename ):
    fin = open( filename )
    fout = open( filename2 , "w")
    for line in fin:
        # Skip any line that contains one of the offending strings
        if True in [item in line for item in offending]:
            continue
        fout.write(line)
    fin.close()
    fout.close()

fixup(sourcefile)
EDIT: Or even better:
for line in fin:
    if not True in [item in line for item in offending]:
        fout.write(line)
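The test True in [...] builds a complete list for every line. A minimal sketch of the same filter using the built-in any() with a generator expression, which stops at the first match, might look like the following. The function name remove_offending_lines and its parameters are illustrative and not from the original post, and the code assumes Python 3 (the thread itself uses Python 2 syntax); the with block closes both files even if an error occurs.

```python
def remove_offending_lines(src_path, dst_path, offending):
    """Copy src_path to dst_path, skipping lines that contain any offending string.

    Returns the number of lines written.
    (Hypothetical helper for illustration, not from the original post.)
    """
    kept = 0
    with open(src_path) as fin, open(dst_path, "w") as fout:
        for line in fin:
            # any() stops at the first offending substring it finds
            if any(item in line for item in offending):
                continue
            fout.write(line)
            kept += 1
    return kept
```

It would be called with the same values as in the question, for example remove_offending_lines(sourcefile, filename2, offending).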
The basic strategy is to write a copy of the input file to the output file, but with changes. In your case, the changes are very simple: you just omit the lines you don't want.
Once you have your copy safely written, you can delete the original file and use os.rename() to rename your temp file to the original file name. I like to write the temp file in the same directory as the original file, to make sure I have permission to write in that directory and because I don't know whether os.rename() can move a file from one volume to another.
You don't need to say for line in fin.readlines(); it is enough to say for line in fin. When you use .readlines() you are telling Python to read every line of the input file, all at once, into memory; when you just use fin by itself you read one line at a time.
Here is your code, modified to make these changes.
sourcefile = "C:\\Python25\\PC_New.txt"
filename2 = "C:\\Python25\\PC_reduced.txt"
offending = ["Exception","Integer","RuntimeException"]

def line_offends(line, offending):
    # True if any whitespace-separated word in the line is an offending word
    for word in line.split():
        if word in offending:
            return True
    return False

def fixup( filename ):
    print "fixup ", filename
    fin = open( filename )
    fout = open( filename2 , "w")
    for line in fin:
        if line_offends(line, offending):
            continue
        fout.write(line)
    fin.close()
    fout.close()
    #os.rename() left as an exercise for the student

fixup(sourcefile)
If line_offends() returns True, we execute continue and the loop moves on to the next line without executing the rest of the loop body. That means the line never gets written. For this simple example, it would really be just as good to do it this way:
for line in fin:
    if not line_offends(line, offending):
        fout.write(line)
I wrote it with the continue because often there is non-trivial work being done in the main loop, and you want to avoid all of it if the test is true. IMHO it is nicer to have a simple "if this line is unwanted, continue" rather than indenting a whole bunch of stuff inside an if for a condition that might be very rare.
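The rename step that the code above leaves as an exercise could be sketched roughly as follows. This is only one possible approach under stated assumptions: it targets Python 3.3 or later, where os.replace() overwrites an existing destination on both Windows and POSIX, so the original does not have to be deleted first; on Python 2, as used in the thread, you would delete the original and call os.rename() as described above. The function name filter_file_in_place is hypothetical, and line_offends() is repeated so the block runs on its own.

```python
import os
import tempfile


def line_offends(line, offending):
    # Same whole-word test as in the answer above
    return any(word in offending for word in line.split())


def filter_file_in_place(path, offending):
    """Rewrite `path` without the lines that contain an offending word.

    Hypothetical helper: name and signature are assumptions for illustration.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the original so the final rename
    # never has to cross volumes.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as fout, open(path) as fin:
            for line in fin:
                if not line_offends(line, offending):
                    fout.write(line)
        # Swap the filtered copy into place only after it is fully written
        os.replace(tmp_path, path)
    except Exception:
        # On failure, leave the original untouched and remove the temp file
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

One caveat with this sketch: the file created by tempfile.mkstemp() has restrictive permissions, so the replaced file may not keep the permissions of the original.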
You're not writing it to the output file. Also, I would use "in" to check whether the string exists in the line. See the modified script below (not tested):
sourcefile = "C:\\Python25\\PC_New.txt"
filename2 = "C:\\Python25\\PC_reduced.txt"
offending = ["Exception","Integer","RuntimeException"]

def fixup( filename ):
    print "fixup ", filename
    fin = open( filename )
    fout = open( filename2 , "w")
    for line in fin.readlines():
        # "offending" is a list, so each word has to be tested separately;
        # "offending in line" would raise a TypeError
        if not any(item in line for item in offending):
            # There are no offending words in this line
            # write it to the output file
            fout.write(line)
    fin.close()
    fout.close()

fixup(sourcefile)
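The two matching styles that appear in this thread do not behave identically, which may matter for log-like input. The substring test (item in line) also hits longer words that merely contain an offending string, while the split-based test used in line_offends() only hits exact whitespace-delimited words and so misses a word followed by punctuation. A small sketch of the difference, with made-up sample lines and hypothetical function names:

```python
offending = ["Exception", "Integer", "RuntimeException"]


def contains_substring(line, offending):
    """True if any offending string appears anywhere in the line."""
    return any(item in line for item in offending)


def contains_word(line, offending):
    """True if any whitespace-delimited word equals an offending string."""
    return any(word in offending for word in line.split())


# Sample lines invented for illustration
samples = [
    "java.lang.RuntimeException: boom",  # substring match only (trailing colon)
    "caught Exception while parsing",    # both tests match
    "IntegerCache initialised",          # substring match only (longer word)
    "nothing to see here",               # neither test matches
]

results = [
    (contains_substring(line, offending), contains_word(line, offending))
    for line in samples
]

for line, (by_substring, by_word) in zip(samples, results):
    print(by_substring, by_word, line)
```

Which test is appropriate depends on the data: for stack-trace lines such as the first sample, the substring test is probably the one that removes what the question intends.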
'''This is a rather simple implementation but should do what you are searching for'''
sourcefile = "C:\\Python25\\PC_New.txt"
filename2 = "C:\\Python25\\PC_reduced.txt"
offending = ["Exception","Integer","RuntimeException"]

def fixup( filename ):
    print "fixup ", filename
    fin = open( filename )
    fout = open( filename2 , "w")
    for line in fin.readlines():
        for item in offending:
            print "got one",line
            line = line.replace( item, "MUST DELETE" )
            line=line.strip()
            fout.write(line)
    fin.close()
    fout.close()

fixup(sourcefile)