How to skip blocks of text when writing a file in Python

Question

Is it possible to use Python to skip blocks of text when writing a file from another file?

For example, let's say the input file is:

This is the file I would like to write this line
I would like to skip this line
and this one...
and this one...
and this one...
but I want to write this one
and this one...

How can I write a script that skips certain lines, which differ in content and size, and resumes writing lines to another file once it recognizes a certain line?

My code reads through the lines, doesn't write duplicate lines, and performs some operations on each line using dictionaries and regex.

Answer 1

def is_wanted(line):
    # You have to define this!
    # Return True to keep the line, or False to discard it.
    raise NotImplementedError("define is_wanted() for your data")

def copy_some_lines(infname, outfname, wanted_fn=is_wanted):
    with open(infname) as inf, open(outfname, "w") as outf:
        outf.writelines(line for line in inf if wanted_fn(line))

copy_some_lines("file_a.txt", "some_of_a.txt")

To extend this to multi-line blocks, you can implement a finite state machine with two states, a good block (lines are written) and a bad block (lines are skipped):

[image: state machine diagram]

which would turn into something like

class BlockState:
    GOOD_BLOCK = True
    BAD_BLOCK = False

    def __init__(self):
        self.state = self.GOOD_BLOCK

    def is_bad(self, line):
        # *** Implement this! ***
        # Return True if the line starts a block to skip.
        raise NotImplementedError

    def is_good(self, line):
        # *** Implement this! ***
        # Return True if the line resumes a block to write.
        raise NotImplementedError

    def __call__(self, line):
        if self.state == self.GOOD_BLOCK:
            if self.is_bad(line):
                self.state = self.BAD_BLOCK
        else:
            if self.is_good(line):
                self.state = self.GOOD_BLOCK
        return self.state

then

copy_some_lines("file_a.txt", "some_of_a.txt", BlockState())
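
As an illustration, the two-state idea above can be applied to the sample input from the question. This is only a sketch: the two regex markers are assumptions chosen to fit that sample, and a list of strings stands in for the input file.

```python
import re

# Hypothetical markers, chosen to match the sample input in the question.
SKIP_START = re.compile(r"^I would like to skip")
SKIP_END = re.compile(r"^but I want to write")


def keep_lines(lines):
    """Yield lines outside skipped blocks using a two-state machine."""
    writing = True  # True: good block, False: bad block
    for line in lines:
        if writing and SKIP_START.search(line):
            writing = False  # the start marker itself is skipped
        elif not writing and SKIP_END.search(line):
            writing = True  # the end marker itself is written
        if writing:
            yield line


sample = [
    "This is the file I would like to write this line\n",
    "I would like to skip this line\n",
    "and this one...\n",
    "and this one...\n",
    "and this one...\n",
    "but I want to write this one\n",
    "and this one...\n",
]
kept = list(keep_lines(sample))
print("".join(kept), end="")
```

Because keep_lines is a generator, it can be passed straight to writelines on an output file, so the input is never held in memory all at once.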

Answer 2

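A minimal version (the file names and the unwanted() test are placeholders):

def unwanted(line):
    # Placeholder test: replace with your own condition.
    return "skip" in line

# Open the input and output files; both are closed automatically.
with open("input.txt") as file1, open("output.txt", "w") as file2:
    for line in file1:
        if unwanted(line):
            continue
        file2.write(line)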
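
The same loop can also carry the duplicate check mentioned in the question. The following is a rough sketch, assuming a hypothetical unwanted() test and using io.StringIO objects in place of real files so that it runs on its own:

```python
import io


def unwanted(line):
    # Hypothetical test: treat lines containing "skip" as unwanted.
    return "skip" in line


def copy_filtered(src, dst):
    """Copy lines from src to dst, dropping unwanted and repeated lines."""
    seen = set()  # lines already written
    for line in src:
        if unwanted(line) or line in seen:
            continue
        seen.add(line)
        dst.write(line)


src = io.StringIO("keep me\nskip me\nkeep me\nlast line\n")
dst = io.StringIO()
copy_filtered(src, dst)
print(dst.getvalue(), end="")
```

The set grows with the number of distinct lines written, which is usually fine for text files of moderate size.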

Answer 3

You can read the file line by line and decide what to do with each line as you read it:

with open("your_file.txt", "r") as lines:  # placeholder file name
    for line in lines:
        # Decide here whether to skip this line or keep it.
        pass

Note that if you want to read all lines regardless of content and only then manipulate them, you can:

with open("your_file.txt") as fil:  # placeholder file name
    lines = fil.readlines()
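
Once all lines are in a list, a block can also be removed by index. Here is a minimal sketch, assuming hypothetical start and end markers; it drops only the first block it finds and keeps the end marker line:

```python
def drop_block(lines, start_marker, end_marker):
    """Return lines without the first block from start_marker up to end_marker.

    The start marker line is dropped; the end marker line is kept.
    """
    start = next(
        (i for i, line in enumerate(lines) if line.startswith(start_marker)),
        None,
    )
    if start is None:
        return list(lines)  # no block to drop
    end = next(
        (i for i in range(start + 1, len(lines))
         if lines[i].startswith(end_marker)),
        len(lines),  # no end marker: drop through the end of the list
    )
    return lines[:start] + lines[end:]


lines = ["a\n", "BEGIN skip\n", "x\n", "y\n", "END skip\n", "b\n"]
result = drop_block(lines, "BEGIN", "END")
print("".join(result), end="")
```

This trades memory for simplicity, so it suits files that fit comfortably in memory.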

Answer 4

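This should work:

SIZE_TO_SKIP = None  # placeholder: set to the line length (without newline) to skip
CONTENT_TO_SKIP = "skip it"

with open("my/input/file") as input_file:
    with open("my/output/file", "w") as output_file:
        for line in input_file:
            text = line.rstrip("\n")  # compare without the trailing newline
            if len(text) != SIZE_TO_SKIP and text != CONTENT_TO_SKIP:
                output_file.write(line)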

4 answers

Answer 1
Score 3, accepted, 2015-01-21 17:10:02

Answer 2
Score 2, 2015-01-21 16:55:32

Answer 3
Score 0, 2015-01-21 16:56:40

Answer 4
Score 0, 2015-01-21 17:03:21
