Replacing a section of text in one file with text from another using python

First a bit of background: yes, I am new to python, but I like to dabble and learn things.

The goal is this: I have an intranet website here at work, and they have us on a static server with no server-side scripting allowed, meaning no PHP. So as I add new pages, I must update every freakin page's menu with the new link. Fortunately, I have an application called ArcGIS installed on my computer, and with it came an installation of python. So I was thinking it would be nice to put together a script that would read a file called "menu.txt", then search recursively in my directory (and subdirectories) for files ending in ".html" and replace all text between some comment tags, like <!--begin menu--> and <!--end menu-->, with the text in "menu.txt".

So I started looking and found this snippet of code:

with open('menu.txt', 'r') as f:
    # read entire file into file1
    # ASSUMING 'file1.txt' is relatively small...
    file1 = f.read()

with open('test.html', 'r') as f:
    # ASSUMING 'file2.txt' is relatively small...
    file2 = f.read()    # read file into file2

# index() will raise an error if not found...
f1_start = file1.index('<!--123-->')
f1_end = file1.index('<!--321-->', f1_start)     # look for '//end' after '//start'

f2_start = file2.index('<!--123-->')
f2_end = file2.index('<!--321-->', f2_start)

# replace file2 lines with proper file1 lines
file2[f2_start:f2_end] = file1[f1_start:f1_end]

with open('test.html', 'w') as f:
    f.write(file2)

and I've also seen many examples using re, replace and such, but nothing that seems to be related to what I need. Anyway, right now I'm just trying it on the one file in the same directory, but when I run this on either my Linux machine or in a Windows python shell I get:

Traceback (most recent call last):
  File "P:\webpages\filereplace.py", line 18, in <module>
    file2[f2_start:f2_end] = file1[f1_start:f1_end]
TypeError: 'str' object does not support item assignment

I thought the problem might've been the with open part, but I don't know.

In this case the contents of menu.txt are essentially a beginning comment tag <!--123-->, then all of the <div id=menu>blah blah blah</div>, then an end comment tag <!--321-->. In my html file, I use the same comment tags, and you get the picture...

Any suggestions?

You are trying to modify a string in place. This is not possible in python, because strings are immutable.

To achieve what you want, you need to create a new string from parts of the two existing strings:

# replace file2 lines with proper file1 lines
new_f = file2[:f2_start] + file1[f1_start:f1_end] + file2[f2_end:]

After that, write the contents to the file like so:

with open('test.html', 'w') as f:
    f.write(new_f)

Also, note that the variable names file1 and file2 are a bit misleading here, as these are not file-like objects, but strings.
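Putting the pieces above together, here is a minimal sketch of the splice as a small function. The function name splice_section and the sample strings are made up for illustration; the markers are the ones from the question, and both strings are assumed to contain each marker.

```python
def splice_section(page, menu, start_marker='<!--123-->', end_marker='<!--321-->'):
    """Return a copy of page whose marked section is taken from menu."""
    # str.index() raises ValueError if a marker is missing.
    m_start = menu.index(start_marker)
    m_end = menu.index(end_marker, m_start)
    p_start = page.index(start_marker)
    p_end = page.index(end_marker, p_start)
    # Strings are immutable, so build a new string out of three slices:
    # everything before the section, the new section, everything after it.
    return page[:p_start] + menu[m_start:m_end] + page[p_end:]


# Hypothetical sample data standing in for menu.txt and test.html.
menu = '<!--123-->\n<div id=menu>new menu</div>\n<!--321-->\n'
page = '<html>\n<!--123-->\n<div id=menu>old menu</div>\n<!--321-->\n</html>\n'
new_page = splice_section(page, menu)
```

The slice taken from menu starts at the start marker and stops just before the end marker, and the tail of page starts at its own end marker, so both markers survive and the script can be run again later.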

Most of the time, when dealing with in-place editing of files, I turn to the fileinput module:

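import os
import fileinput

if __name__ == '__main__':
    # Menu should not have any marker, just pure contents
    with open('menu.txt') as f:
        menu_contents = f.read()

    # Initialize a few items
    start_marker = '<!--123-->'
    end_marker   = '<!--321-->'
    file_list = ['index.html', 'foo.html']
    found_old_contents = False

    # Loop to replace text in place; with inplace=True, whatever is
    # printed inside the loop is written back to the current file.
    # print(x) with a single argument works in both Python 2 and 3.
    for line in fileinput.input(file_list, inplace=True):
        line = line.rstrip()

        # A marker is only recognized when it is alone on its line
        # with no leading whitespace.
        if line == start_marker:
            found_old_contents = True
            print(line)
            print(menu_contents)
        elif line == end_marker:
            found_old_contents = False

        # Skip the old menu lines between the two markers
        if not found_old_contents:
            print(line)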

Discussion

The key here is the function fileinput.input(file_list, inplace=True), which takes a list of file names, iterates through them line by line, and writes back to the files whatever you print out.

You will need to come up with a list of files (e.g. file_list) by means of os.walk() or some other method.
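As a rough sketch of the os.walk() route, something like the following could build that list. The helper name find_html_files and the choice of '.' as the starting directory are assumptions for illustration, not part of the code above.

```python
import os


def find_html_files(root):
    """Collect the paths of all .html files under root, subdirectories included."""
    matches = []
    # os.walk() yields one (dirpath, dirnames, filenames) tuple per directory.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith('.html'):
                matches.append(os.path.join(dirpath, name))
    return matches


# Assumption: the script is started from the top of the web directory.
file_list = find_html_files('.')
```

If the list comes back empty, it is probably worth skipping the fileinput loop altogether, since fileinput falls back to reading standard input when it is given no file names.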

I have tested my code against two .html files and am confident that it works. I cannot guarantee the results, especially for nested directories. Good luck.
