
Python: remove a varying number of lines from a txt file

I have several txt files whose lines are structured as follows:

File 1

Header1, xx, yy
Redundant line 1
Redundant line 2
Redundant line 3
Header2, #012345 (random numbers)
data content (to the end of file)

File 2

Header1, xx, yy
Redundant line 1
Redundant line 2
Redundant line 3
Redundant line 4
Header2, #67891 (random numbers)
data content (to the end of file)

File 3

Header1, xx, yy
Redundant line 1
Redundant line 2
Header2, #54321 (random numbers)
data content (to the end of file)

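Expected output:

For each file, I want to delete the redundant lines and keep only Header1, the Header2 line with its #zzzzz number, and the data content that follows through the end of the file. The result should be saved to a new, separate file, so that each new file has the following structure: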

Header1, xx, yy
Header2, #zzzzz (keep random numbers from original file)
data content (to the end of file)

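My code:

My code does not work for every file, because the number of redundant lines varies from file to file. Can anyone offer some suggestions? Thanks!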

with open('File1.txt') as old, open('new_file1.txt', 'w') as new:
    lines = old.readlines()
    new.writelines(lines[0:1]) #keep Header1
    new.writelines(lines[N:]) #keep Header2 and following data content to the end

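You can define a variable N with an initial value of 1 and keep incrementing it by 1 until a line matches the regular expression .*?,\s*#\d+ (the second header). The \s* allows for the space after the comma in the sample headers:

import re

with open('File1.txt') as old, open('new_file1.txt', 'w') as new:
    lines = old.readlines()
    new.writelines(lines[:1])  # keep Header1
    N = 1
    # advance N until the line looks like "Header2, #012345"; stop at end of file
    while N < len(lines) and not re.match(r".*?,\s*#\d+", lines[N]):
        N += 1
    new.writelines(lines[N:])  # keep Header2 and the data content to the end of the file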

Input file File1.txt

Header1, xx, yy
Redundant line 1
Redundant line 2
Redundant line 3
Header2, #012345 (random numbers)
data content (to the end of file)

Output file new_file1.txt

Header1, xx, yy
Header2, #012345 (random numbers)
data content (to the end of file)
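
Since the question mentions several files, the same idea could be wrapped in a reusable function. The following is only a sketch under some assumptions: the function name strip_redundant_lines and the HEADER2 pattern are hypothetical, and the pattern assumes the second header always looks like "some text, comma, optional spaces, # and digits". It reads the file line by line instead of loading it all with readlines(), and returns whether a second header was found.

```python
import re

# Assumed shape of the second header: text without commas, a comma,
# optional whitespace, '#', then one or more digits (e.g. "Header2, #012345").
HEADER2 = re.compile(r"[^,]*,\s*#\d+")


def strip_redundant_lines(src, dst):
    """Copy src to dst, keeping the first line and everything from Header2 on.

    Returns True if a line matching HEADER2 was found, False otherwise.
    """
    with open(src) as old, open(dst, "w") as new:
        new.write(old.readline())  # keep Header1 (the first line)
        keeping = False
        for line in old:
            # Start copying once the second header shows up.
            if not keeping and HEADER2.match(line):
                keeping = True
            if keeping:
                new.write(line)
        return keeping
```

It could then be called once per file, for example strip_redundant_lines('File1.txt', 'new_file1.txt'), and a False return value would flag a file where no second header was recognized.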
