Python: deleting a variable number of lines from a txt file

Question

I have several txt files whose lines are structured as follows:

File 1

Header1, xx, yy
Redundant line 1
Redundant line 2
Redundant line 3
Header2, #012345 (random numbers)
data content (to the end of file)

File 2

Header1, xx, yy
Redundant line 1
Redundant line 2
Redundant line 3
Redundant line 4
Header2, #67891 (random numbers)
data content (to the end of file)

File 3

Header1, xx, yy
Redundant line 1
Redundant line 2
Header2, #54321 (random numbers)
data content (to the end of file)

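Expected output:

For each file, I want to delete the redundant lines and keep only the Header1 line, the Header2 line with its #zzzzz number, and the data content that follows through the end of the file. The result should be saved to a new, separate file, so that each new file has the following structure: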

Header1, xx, yy
Header2, #zzzzz (keep random numbers from original file)
data content (to the end of file)

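My code:

My code does not work for every file, because the number of redundant lines varies from file to file. Could anyone offer some suggestions? Thanks!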

with open('File1.txt') as old, open('new_file1.txt', 'w') as new:
    lines = old.readlines()
    new.writelines(lines[0:1]) #keep Header1
    new.writelines(lines[N:]) #keep Header2 and following data content to the end

Answer 1

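You can define a variable N with an initial value of 1 and keep incrementing it by 1 until a line matches the regular expression .*?,\s*#\d+ (the second header). The \s* allows for the space between the comma and the # in the sample Header2 lines: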

import re
with open('File1.txt') as old, open('new_file1.txt', 'w') as new:
    lines = old.readlines()
    new.writelines(lines[:1])  # keep Header1
    N = 1
    # advance N until lines[N] looks like "Header2, #012345";
    # the bounds check avoids an IndexError if no such line exists
    while N < len(lines) and not re.match(r".*?,\s*#\d+", lines[N]):
        N += 1
    new.writelines(lines[N:])  # keep Header2 and following data content to the end
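
As a variation on the loop above, the same idea can be wrapped in a function that reads the input line by line instead of loading it all at once. This is only a sketch: the names strip_redundant_lines and HEADER2_RE are made up for illustration, and it assumes Header2 is the first line after Header1 that contains a comma followed by # and digits.

```python
import re

# Hypothetical pattern: a comma, optional whitespace, then '#' and digits.
HEADER2_RE = re.compile(r",\s*#\d+")

def strip_redundant_lines(src_path, dst_path):
    """Copy Header1, then everything from Header2 onward.

    Returns True if a Header2-like line was found, False otherwise.
    """
    found = False
    with open(src_path) as old, open(dst_path, "w") as new:
        new.write(old.readline())  # keep Header1 (the first line)
        for line in old:
            # Start copying at the first line that looks like Header2.
            if not found and HEADER2_RE.search(line):
                found = True
            if found:
                new.write(line)
    return found
```

Calling strip_redundant_lines('File1.txt', 'new_file1.txt') would then replace the with-block above. The return value lets the caller notice files that have no Header2 line.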

Input file File1.txt:

Header1, xx, yy
Redundant line 1
Redundant line 2
Redundant line 3
Header2, #012345 (random numbers)
data content (to the end of file)

Output file new_file1.txt:

Header1, xx, yy
Header2, #012345 (random numbers)
data content (to the end of file)
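
Since the question mentions several files, the same filtering could be applied to a whole folder in one pass. The sketch below is an assumption-laden illustration, not part of the accepted answer: the helper names clean_lines and clean_folder, the *.txt glob, and the new_ prefix for output files are all hypothetical choices.

```python
import re
from pathlib import Path

def clean_lines(lines):
    """Return Header1 plus everything from the first Header2-like line onward."""
    for i, line in enumerate(lines[1:], start=1):
        # Assumed Header2 shape: a comma, optional whitespace, '#', digits.
        if re.search(r",\s*#\d+", line):
            return lines[:1] + lines[i:]
    return lines[:1]  # no Header2 found: keep only Header1

def clean_folder(folder):
    """Write new_<name> next to every *.txt file in folder; return the new names."""
    written = []
    # sorted() materializes the file list before any output file is created.
    for path in sorted(Path(folder).glob("*.txt")):
        if path.name.startswith("new_"):
            continue  # skip outputs left over from an earlier run
        lines = path.read_text().splitlines(keepends=True)
        out = path.with_name("new_" + path.name)
        out.write_text("".join(clean_lines(lines)))
        written.append(out.name)
    return written
```

Each input file keeps its own count of redundant lines, because the Header2 position is searched for separately in every file.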

Solution 1
1 accepted 2021-10-13 14:34:28
