Inserting lines to a file after a string match

I'm trying to search for a substring within the lines of a file and insert similar lines immediately after the matching line. Although there are similar solutions using the fileinput module, I could not figure out how to use it in my case.

Here is what I have tried:

list = ["abc", "pqr", "xyz"]

inputfile = open (somefile.txt, 'a+')
for line in <inputfile>:    
    if 'stringstosearch' in line:
       for <item> in list:
               new_line = "new_line with %s" %(item)        
               inputfile.write(new_line + "\n") 
    break
inputfile.close()

For example, if the text file is:

Torquent scelerisque aptent hac rhoncus vel
Turpis vestibulum tellus laoreet mollis conubia facilisis tempor nec semper
In mi mauris etiam quisque sem congue est velit lacus convallis amet ante ad
Integer maecenas semper quisque nisi hendrerit, libero feugiat cursus euismod accumsan
Dui sed magna vivamus augue ac quisque ac mauris torquent eros taciti
Conubia curae vel himenaeos dictumst sed at

string to search = "mauris etiam quisque"
list = ["abc", "pqr", "xyz" ]

Expected output after file write:

Torquent scelerisque aptent hac rhoncus vel
Turpis vestibulum tellus laoreet mollis conubia facilisis tempor nec semper
In mi mauris etiam quisque sem congue est velit lacus convallis amet ante ad
new_line with abc
new_line with pqr
new_line with xyz
Integer maecenas semper quisque nisi hendrerit, libero feugiat cursus euismod accumsan
Dui sed magna vivamus augue ac quisque ac mauris torquent eros taciti
Conubia curae vel himenaeos dictumst sed at

You can't generally insert into the middle of a file.*

The generic solution to this is to copy to a new file, inserting in the midst of copying, and then move the new file on top of the old one. For example:

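import os
import tempfile

# inpath, needs_inserting_after() and stuff_to_insert_after() are placeholders
# for your own path and logic.
with tempfile.NamedTemporaryFile('w', delete=False) as outfile:
    with open(inpath) as infile:
        for line in infile:
            outfile.write(line)
            if needs_inserting_after(line):
                outfile.write(stuff_to_insert_after(line))
# Move the finished copy on top of the original.
os.replace(outfile.name, inpath)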

Note that os.replace doesn't exist in Python 2.7. If you don't care about Windows, you can use os.rename instead. If you do, I'd strongly suggest looking for a backport of os.replace on PyPI; there are at least two of them. Otherwise, you have to learn about the whole mess with exclusive locks and atomic moves on Windows.
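Applied to the example in the question, the copy-and-replace approach might look like the sketch below. The function name insert_after_match and its parameters are hypothetical, not part of any library. It assumes Python 3, and it creates the temporary file in the same directory as the original so that os.replace does not have to cross filesystems.

```python
import os
import tempfile


def insert_after_match(path, needle, items):
    """Rewrite path, adding one line per item after each line containing needle.

    Hypothetical helper for illustration only.
    """
    # Keep the temp file next to the original so the final move stays
    # on a single filesystem.
    directory = os.path.dirname(os.path.abspath(path))
    with tempfile.NamedTemporaryFile('w', dir=directory, delete=False) as outfile:
        with open(path) as infile:
            for line in infile:
                outfile.write(line)
                if needle in line:
                    # A matching last line may lack a trailing newline.
                    if not line.endswith("\n"):
                        outfile.write("\n")
                    for item in items:
                        outfile.write("new_line with %s\n" % item)
    # Replace the original only after the copy is complete.
    os.replace(outfile.name, path)
```

Calling insert_after_match('somefile.txt', 'mauris etiam quisque', ['abc', 'pqr', 'xyz']) should produce the expected output from the question. One caveat: the temporary file is created with restrictive permissions, so the rewritten file may not keep the original file's mode.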

There are also some higher-level libraries that wrap the whole thing up for you. (I wrote one called fatomic that I think serves as nice sample code, but I'm not sure I'd trust it for production code without a lot more testing. I'm sure that if you search PyPI you can find other alternatives.)


Of course there are alternatives:

You can move the original file to a backup path, then copy it into a new file at the normal path, instead of copying to a new file at a temporary path and then moving after the fact. This has the disadvantage of leaving you with half a file if you fail in the middle, but the advantage of not needing to deal with the exclusive-locks-on-Windows problem. This is effectively what fileinput.FileInput with inplace=True automates for you.
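Since the question mentions fileinput, here is a minimal sketch of that variant. The function name insert_after_match_inplace is made up for illustration, and the code assumes Python 3.2 or later, where fileinput.input can be used as a context manager. While the file is being processed with inplace=True, standard output is redirected into it, so print writes to the file.

```python
import fileinput


def insert_after_match_inplace(path, needle, items):
    """Hypothetical helper: rewrite path in place via fileinput."""
    with fileinput.input(files=[path], inplace=True) as f:
        for line in f:
            # print() goes to the file here, not to the terminal.
            print(line, end="")
            if needle in line:
                # A matching last line may lack a trailing newline.
                if not line.endswith("\n"):
                    print()
                for item in items:
                    print("new_line with %s" % item)
```

Be careful not to print debugging output inside the loop, because it would end up in the file as well.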

You can read the whole file into memory, process it in-memory, then write the whole file back out. This has the advantage of being dead simple, not needing any extra files, and meaning that if anyone has a file handle to your file (rather than a pathname) they see the new version once you're done. But the last of those can be a disadvantage. And of course this means that you need enough memory to hold all your data at once.

Finally, you can always shift the whole file from the current position up by N bytes before writing N bytes. This has most of the advantages of both of the above, but it's also messy and slow.
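To make the shifting idea concrete, here is a rough sketch that works on raw bytes. Both function names and the chunk_size parameter are assumptions made for illustration. The tail of the file is moved backwards from the end, one chunk at a time, so that nothing is overwritten before it has been read.

```python
import os


def offset_after_first_match(path, needle):
    """Return the byte offset just past the first line containing needle (bytes)."""
    offset = 0
    with open(path, "rb") as f:
        for line in f:
            offset += len(line)
            if needle in line:
                return offset
    return None


def insert_bytes_at(path, offset, data, chunk_size=64 * 1024):
    """Insert data at offset, assuming 0 <= offset <= file size."""
    n = len(data)
    with open(path, "r+b") as f:
        f.seek(0, os.SEEK_END)
        pos = f.tell()
        # Shift everything after offset up by n bytes, last chunk first.
        while pos > offset:
            start = max(offset, pos - chunk_size)
            f.seek(start)
            chunk = f.read(pos - start)
            f.seek(start + n)
            f.write(chunk)
            pos = start
        # The gap at offset is now free to overwrite.
        f.seek(offset)
        f.write(data)
```

For the question's file, you would look up the offset with offset_after_first_match('somefile.txt', b'mauris etiam quisque') and, if it is not None, pass it to insert_bytes_at together with the encoded new lines. A crash partway through leaves the file in an inconsistent state, which is part of why this approach is messy.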


* Why did I say "generally" there? Well, ultimately, the filesystem has to have some way of inserting a new block in the middle of a file. And some filesystems will expose this to the user level. Some older platforms used to have user-level features built on top of this, like "random access text files" on Apple ][ ProDOS or the thingy I forget in VMS. So, it's not literally true that you can't ever insert into the middle of a file. It's just true in every case you care about.

You can't just insert into the middle of a file. For small files, first read the file entirely into memory, then open the same file in write mode and write the new lines whenever you find the string.

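items = ["abc", "pqr", "xyz"]  # renamed from `list` to avoid shadowing the built-in

# Read every line into memory first; this is fine for small files.
with open('somefile.txt', 'r') as infile:
    lines = infile.readlines()

# Reopen in write mode (which truncates the file) and copy the lines back,
# adding the new lines right after each match.
with open('somefile.txt', 'w') as write_file:
    for line in lines:
        write_file.write(line)
        if 'stringstosearch' in line:
            for item in items:
                new_line = "new_line with %s" % item
                write_file.write(new_line + "\n")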
