Trying to skip lines that match regex when writing to file, but new file has extra new lines

This is a spin-off of a question I asked here.

I'm trying to set up a method that can edit text files based on an input dictionary. This is what I have so far:

import re

info = {'#check here 1':{'action':'read'}, '#check here 2':{'action':'delete'}}

# Capture the comment (from '#' to the end of the line, excluding the newline)
search_pattern = re.compile(r'.*(#.+)')

with open(input_file_name, "r") as old_file, open(output_file_name, "w+") as new_file:
    lines = old_file.readlines()

    for line in lines:
        edit_point = search_pattern.search(line)
        if edit_point:
            result = edit_point.group(1)
            if result in info and info[result]["action"] == "insert":#insert new lines to file
                print("insert information to file")
                new_file.write("\n".join([str(n) for n in info[result]["new_lines"]]))
                new_file.write(result)
            elif result in info and info[result]["action"] == "delete":#skip lines with delete action
                print("found deletion point. skipping line")
            else:#write to file any line with a comment that is not in info
                new_file.write(line)
        else:#write lines that do not match regex for (#.*)
            new_file.write(line)

Basically, when you submit the dictionary, the program iterates through the file, searching for comments. If a comment is in the dictionary, it checks the corresponding action. If the action is insert, it writes the new lines to the file. If it is delete, it skips that line. Any line that does not have a comment should be written to the new file.

My problem is that when I delete a line from the file, there appear to be extra blank lines where the deleted lines used to be. For example, if I have this input:

hello world

how are you #keep this
I'm fine #check here 2
whats up

I expect the output to be:

hello world

how are you #keep this
whats up

But I instead have a blank line there:

hello world

how are you #check here 2

whats up

I suspect the cause is my final else statement, which writes to the file any line that does not match the regex (that is, where edit_point is None), in this case blank lines. However, my understanding is that the for loop should go line by line and simply skip that line. Can anyone tell me what I'm missing here?

That looks a little tangled: you're mixing the reading and writing logic with the processing logic, which makes it difficult to keep track of what's going on. Try this approach instead:

from enum import Enum
from typing import Dict, List


class Action(Enum):
    KEEP = "keep"
    REMOVE = "remove"


definition = {
    "#KEEP": {"action": Action.KEEP},
    "#REMOVE": {"action": Action.REMOVE},
}


def clean_comments(
    lines: List[str], definition: Dict[str, Dict[str, Action]]
) -> List[str]:

    # Keep a list of the lines that should be in the output
    output: List[str] = []

    # Loop the lines
    for line in lines:

        # If any of the comments in the definition is found, process further
        if any(comment in line for comment in definition):

            # Figure out what to do
            for comment, details in definition.items():
                if comment in line:

                    if details["action"] == Action.KEEP:
                        output.append(line)
                        break

                    elif details["action"] == Action.REMOVE:
                        break

        # Keep all other lines
        else:
            output.append(line)

    return output


# Your data here...
with open("test_input.txt", "r") as f:
    lines = f.readlines()

# Use the function to clean the text
clean_text = "".join(clean_comments(lines, definition))

# Show the output
print(clean_text)

# Write to file
with open("test.txt", "w") as f:
    f.write(clean_text)

Output:

hello world

how are you #KEEP: This line will be kept in the output file
whats up
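
The function above only covers keeping and removing lines. As a rough sketch (not part of the original answer), the same separation of processing from file I/O could be applied to the question's own regex and its insert and delete actions. The name apply_actions and the sample list are made up for illustration. The "new_lines" key comes from the question's code. Writing the inserted lines before the marker line, and keeping the whole marker line rather than only the comment, are assumptions about the intended behaviour.

```python
import re
from typing import Dict, List

# Same pattern as in the question: captures the comment without the newline
SEARCH_PATTERN = re.compile(r".*(#.+)")


def apply_actions(lines: List[str], info: Dict[str, dict]) -> List[str]:
    """Return the lines to write, applying delete/insert rules from info."""
    output: List[str] = []
    for line in lines:
        match = SEARCH_PATTERN.search(line)
        # Strip trailing whitespace so "#tag " still matches the key "#tag"
        comment = match.group(1).rstrip() if match else None
        rule = info.get(comment, {}) if comment else {}
        action = rule.get("action")

        if action == "delete":
            # Drop the whole line, including its newline
            continue
        if action == "insert":
            # Give every inserted line its own newline (assumed placement:
            # before the marker line)
            output.extend(f"{new_line}\n" for new_line in rule.get("new_lines", []))
        # Lines without a rule, and marker lines after an insert, pass through
        output.append(line)
    return output


info = {
    "#check here 1": {"action": "insert", "new_lines": ["inserted line"]},
    "#check here 2": {"action": "delete"},
}
sample = [
    "hello world\n",
    "\n",
    "how are you #check here 1\n",
    "I'm fine #check here 2\n",
    "whats up\n",
]

result = apply_actions(sample, info)
for line in result:
    print(repr(line))  # repr() makes stray or missing newlines visible
```

Because the function works on a plain list of strings, the result can be inspected with repr() before anything is written to disk, which makes an unexpected "\n" entry easy to spot.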
