Python: Write an entire file to another file after a specific line that meets a particular condition

I'm working on a Python script that cobbles together text from different files so that I can create alternative versions of a website, easily compare different designs, and make sure they all contain the same underlying data, viz. that the menu items are consistent across all versions.

One specific problem area is making sure that the menu, which is a cornucopia of different meals, is always the same. Therefore I've made this function:

def insert_menu():
    with open("index.html", "r+") as index:
        with open("menu.html", "r+") as menu:
            for i in index:
                if "<!-- Insert Menu here -->" in i:
                    for j in menu:
                        index.write(j)

However, it doesn't behave the way I want it to, and I have been unable to extract what I need from other answers here on Stack Overflow.

In its current state it appends the text stored in menu.html to the end of index.html.

I want it to write the text in menu.html below the line in index.html that contains <!-- Insert Menu here -->. That line is not necessarily always at the same line number, which rules out writing at a specific line. Then, after everything inside menu.html has been written to index.html, I want index.html to "continue", so to speak.

Basically I mean to wedge the text from menu.html into index.html after the line containing <!-- Insert Menu here -->, but there is more text underneath that comment that I must retain (scripts etc.).

Copied from the index.html document, this is what surrounds <!-- Insert Menu here -->:

<html>
    <body>
        <div id="wrapper">
            <div id="container">

                <div class="map">
                </div>

                <!-- Insert Menu here -->

            </div><!-- Container ends here -->
        </div><!-- Wrapper ends here -->
    </body>
</html>

Note that in index.html the above block is all indented inside a larger div, and I cannot easily replicate this here on SO.

How can I change my code to achieve the desired result, or am I going about this in a very roundabout way?

And how can I clarify this question to help you help me?

Can you try this and tell me if it works? Since I don't have your file, assume you have read it and stored its contents in string.

import re

string = "blablabla \n  blebleble \n <!-- Insert Menu here --> \n bliblibli"
pattern = ".*<!-- Insert Menu here -->(.*)"
# re.S lets "." match newlines, so the group captures everything after the marker
print(re.match(pattern, string, re.S).groups())

This will match whatever comes after the <!-- Insert Menu here --> comment, including any spaces and the newline. If you want to skip to the next line:

pattern = ".*<!-- Insert Menu here --> .*\n (.*)"

Update: I just realized this implies reading the whole file. Is that acceptable? Cheers!
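As a rough sketch of how the regex idea could be carried through to an actual insertion, assuming the whole file fits in memory: the function name insert_after_marker and the sample strings are hypothetical, introduced only for illustration, and the pattern differs from the ones above in that it captures both halves of the document.

```python
import re

MARKER = "<!-- Insert Menu here -->"

def insert_after_marker(index_text, menu_text, marker=MARKER):
    """Return index_text with menu_text inserted after the line containing marker."""
    # Group 1: everything up to and including the end of the marker line.
    # Group 2: the rest of the document.
    pattern = r"(.*?" + re.escape(marker) + r"[^\n]*\n?)(.*)"
    match = re.match(pattern, index_text, re.S)
    if match is None:
        return index_text  # marker not found: leave the text unchanged
    before, after = match.groups()
    # Make sure the menu starts and ends on its own line.
    if not before.endswith("\n"):
        before += "\n"
    if menu_text and not menu_text.endswith("\n"):
        menu_text += "\n"
    return before + menu_text + after

sample_index = "<div>\n    <!-- Insert Menu here -->\n</div>\n"
sample_menu = "<ul>\n</ul>\n"
print(insert_after_marker(sample_index, sample_menu))
```

The result would still have to be written to a file; writing it to a new file rather than back to index.html keeps the original intact.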

Assuming that the files aren't huge, a simple string replacement will work:

def insert_menu():
    with open("index.html") as index:
        index_text = index.read()
    with open("menu.html") as menu:
        menu_text = menu.read()
    # I called it index2 so it doesn't overwrite... you can change that
    with open("index2.html", "w") as index2:
        index2.write(index_text.replace('<!-- Insert Menu here -->', menu_text))
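Note that replace as used above substitutes the menu for the comment itself. If the comment should stay in the output with the menu on the line below it, one possible variant is sketched here. It assumes nothing else follows the comment on its line, and merge_menu is a hypothetical helper that works on strings rather than files.

```python
MARKER = "<!-- Insert Menu here -->"

def merge_menu(index_text, menu_text, marker=MARKER):
    """Return index_text with menu_text placed on the line after the marker."""
    # Keep the marker and put the menu on the following line.
    # The trailing newline of the menu is dropped because the newline that
    # already follows the marker in index_text ends the last menu line.
    # The final argument (1) limits the replacement to the first occurrence.
    return index_text.replace(marker, marker + "\n" + menu_text.rstrip("\n"), 1)

print(merge_menu("<div>\n<!-- Insert Menu here -->\n</div>\n", "<ul>\n</ul>\n"))
```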

Since the other one does not seem to be working, what about this option (less elegant, admittedly):

condition = False
new_string = ""
with open("index.html") as f:
    for line in f:
        if condition:
            # each line already ends with its own newline, so none is added
            new_string += line
        if "<!-- Insert Menu here -->" in line:
            # from the next line onward, collect everything
            condition = True

Then you can write new_string to the new file.

Does this work?

Trying to update the source file in place will not work. Writing to index while reading it will not do what you think: a write never inserts text, it overwrites whatever follows the current file position, so the lines after your marker string would be lost.
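A small, self-contained demonstration of that behaviour, using a throwaway temporary file rather than index.html. The function name overwrite_demo and the file contents are made up for illustration; binary mode is used so the byte offsets are unambiguous.

```python
import os
import tempfile

def overwrite_demo():
    """Show that writing in "r+" mode overwrites bytes instead of inserting them."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        with open(path, "wb") as f:
            f.write(b"AAAA\nBBBB\nCCCC\n")
        with open(path, "r+b") as f:
            f.seek(5)        # byte offset of the start of the second line
            f.write(b"xx")   # replaces "BB"; nothing after it is shifted
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)

print(overwrite_demo())  # b'AAAA\nxxBB\nCCCC\n'
```

The file keeps its original length: the two written bytes took the place of two existing ones.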

Instead, you should treat both the index and the menu source files as inputs and create a third file as the output (you are basically merging the two input files into one combined output file). Using "classic" syntax:

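output = open('merged.html', 'w')
for line in open('index.html'):
    if '<!-- Insert Menu here -->' in line:
        # the marker line is replaced by the contents of menu.html
        for menuline in open('menu.html'):
            output.write(menuline)
    else:
        output.write(line)
output.close()  # flush the merged text to disk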

It's trivial to change that to the "with" syntax if you prefer.
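For reference, a sketch of what the "with" form might look like. Unlike the loop above, which drops the marker line, this version keeps it and writes the menu below it, as the question asks. The function name merge_files and its parameters are hypothetical.

```python
MARKER = "<!-- Insert Menu here -->"

def merge_files(index_path, menu_path, output_path, marker=MARKER):
    """Copy index_path to output_path, inserting menu_path after the marker line."""
    with open(index_path) as index, \
         open(menu_path) as menu, \
         open(output_path, "w") as output:
        for line in index:
            output.write(line)  # keep every index line, including the marker
            if marker in line:
                if not line.endswith("\n"):
                    output.write("\n")  # marker was the last line, unterminated
                output.writelines(menu)  # copy the menu line by line
```

A call such as merge_files("index.html", "menu.html", "merged.html") would then produce the combined file, and all three files are closed when the with block ends.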
