How to delete a specific number of lines before or after a specific string in a text file with Python

Question

All I can find is how to delete every line after a specific word, but I only want to delete a specific number of lines.

For example, I have a file that contains:

FCT
Paris
105,4
35
2,161 million
LZQ
London
1572
11
8,982 million
PRI
Paris
105,4
35
2,161 million
Rome
1285
11
2,873 million
PRI
Paris
105,4
35
2,161 million

Now I want to delete the line containing Paris, the line before it, and the 3 lines after it.

The expected output would be:

LZQ
London
1572
11
8,982 million

What I have so far only deletes the line containing Paris:

bad_words = ['Paris']

with open('DataSystem.txt') as oldfile, open('newfile.txt', 'w') as newfile:
    for line in oldfile:
        if not any(bad_word in line for bad_word in bad_words):
            newfile.write(line)

Answer 1

This is pretty inelegant, but it works, assuming you want to remove exactly one preceding line and exactly three following lines whenever a "bad word" is encountered. It will not work as intended if the number of lines following a "bad word" varies:

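bad_words = {"Paris"}  # set membership tests are O(1) on average

with open('DataSystem.txt') as oldfile:
    # splitlines() avoids a trailing empty string when the file ends with a newline
    data = oldfile.read().splitlines()

i = 0
new_data = []
while i < len(data):
    item = data[i]
    if item in bad_words:
        # Drop the line before the match; the check avoids an IndexError
        # when nothing has been kept yet (e.g. the match is the first line).
        if new_data:
            del new_data[-1]
        i += 4  # skip the match itself and the three lines after it
        continue
    new_data.append(item)
    i += 1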

Output (the contents of new_data):

['LZQ',
 'London',
 '1572',
 '11',
 '8,982 million',
 'Rome',
 '1285',
 '11',
 '2,873 million']

You can then write this to your new file:

with open('newfile.txt', 'w') as newfile:
    newfile.write("\n".join(new_data))
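The fixed offsets in the loop above (one line before, three lines after) can be turned into parameters. The following is only a minimal sketch and not part of the original answer: the function name `drop_around` and its `before` and `after` parameters are hypothetical, it matches whole lines exactly as the loop above does, and it makes no special provision for matches that sit close enough together to overlap.

```python
def drop_around(lines, bad_words, before=1, after=3):
    """Return lines with each bad-word line removed, together with up to
    `before` kept lines ahead of it and `after` lines behind it.
    (Hypothetical helper, not from the original answer.)"""
    kept = []
    i = 0
    while i < len(lines):
        if lines[i] in bad_words:
            # Discard up to `before` lines that were already kept.
            # The guard matters: kept[-0:] would address the whole list.
            if before:
                del kept[-before:]
            # Skip the matching line and the `after` lines that follow it.
            i += after + 1
            continue
        kept.append(lines[i])
        i += 1
    return kept


sample = ["PRI", "Paris", "105,4", "35", "2,161 million",
          "Rome", "1285", "11", "2,873 million"]
print(drop_around(sample, {"Paris"}))
# ['Rome', '1285', '11', '2,873 million']
```

Using a slice deletion (`del kept[-before:]`) rather than `del kept[-1]` means a match on the very first line does not raise an IndexError.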

Answer 2

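This does just what I described: read the file five lines at a time, and if no "bad word" is found in the second line, write those five lines out. It relies on every record being exactly five lines long; a shorter record, such as the four-line Rome record in the sample, shifts the alignment of every group after it.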

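bad_words = ['Paris']

with open('DataSystem.txt') as oldfile, open('newfile.txt', 'w') as newfile:
    while True:
        # readline() returns '' at end of file, so a short final group is padded with ''
        lines = [oldfile.readline() for _ in range(5)]
        if not lines[0]:
            break
        # The second line of each group is the one checked against bad_words
        if lines[1].rstrip() not in bad_words:
            newfile.write(''.join(lines))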
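The five-lines-at-a-time read can also be expressed with `itertools.islice`. This is a sketch rather than the answer's own code, and it assumes, as the answer does, that every record is exactly five lines long. The names `chunks` and `filter_records` are hypothetical; both accept any iterable of lines, so they can be tried on a list before being pointed at a file object.

```python
from itertools import islice


def chunks(lines, size=5):
    """Yield successive lists of up to `size` items (hypothetical helper)."""
    it = iter(lines)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def filter_records(lines, bad_words, size=5, key_index=1):
    """Keep only the chunks whose line at `key_index` is not a bad word."""
    kept = []
    for chunk in chunks(lines, size):
        # A short final chunk may have no line at key_index; it is kept as-is.
        if len(chunk) > key_index and chunk[key_index].rstrip() in bad_words:
            continue
        kept.extend(chunk)
    return kept


sample = ["FCT\n", "Paris\n", "105,4\n", "35\n", "2,161 million\n",
          "LZQ\n", "London\n", "1572\n", "11\n", "8,982 million\n"]
print("".join(filter_records(sample, {"Paris"})), end="")
```

With a real file, `filter_records(oldfile, bad_words)` could be passed to `newfile.writelines`, since a file object iterates over its lines.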

Answer 3

Since the last line of each record must contain "million", you can try this code.

Example code:

bad_words = ['Paris']

with open('DataSystem.txt') as oldfile, open('newfile.txt', 'w') as newfile:
    temp = []        # lines of the record currently being read
    is_bad = False   # becomes True once the current record contains a bad word
    for line in oldfile:
        temp.append(line)
        if any(bad_word in line for bad_word in bad_words):
            is_bad = True
        # A line containing "million" closes the current record
        if "million" in line:
            if not is_bad:
                newfile.writelines(temp)
            is_bad = False
            temp = []

Result:

LZQ
London
1572
11
8,982 million
Rome
1285
11
2,873 million
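The grouping step can be separated from the filtering step, which makes each part easier to test. The sketch below is not from the original answer: `iter_records` and `keep_clean_records` are hypothetical names, and the code assumes that each record ends with a line containing "million". Unlike the code above, which never writes lines that follow the last "million" line, this version yields such leftover lines as a final partial record.

```python
def iter_records(lines, terminator="million"):
    """Yield lists of lines, each ending at a line containing `terminator`."""
    record = []
    for line in lines:
        record.append(line)
        if terminator in line:
            yield record
            record = []
    if record:
        # Leftover lines with no terminator form a final, partial record.
        yield record


def keep_clean_records(lines, bad_words):
    """Return the lines of every record that mentions no bad word."""
    kept = []
    for record in iter_records(lines):
        if not any(bad in line for line in record for bad in bad_words):
            kept.extend(record)
    return kept


sample = ["PRI", "Paris", "105,4", "35", "2,161 million",
          "Rome", "1285", "11", "2,873 million"]
print(keep_clean_records(sample, ["Paris"]))
# ['Rome', '1285', '11', '2,873 million']
```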
