How to modify a text file using Python

I have the following text file:

  1. It's hard to explain puns to kleptomaniacs because they always take things literally.

  2. I used to think the brain was the most important organ. Then I thought, look what's telling me that.

I use the following script to remove the numbering and the newlines:

import re
with open('jokes.txt', 'r+') as original_file:
    modfile = original_file.read()
    modfile = re.sub(r"\d+\. ", "", modfile)
    modfile = re.sub("\n", "", modfile)
    original_file.seek(0)
    original_file.truncate()
    original_file.write(modfile)

After running the script, this is how my text file looks:

It's hard to explain puns to kleptomaniacs because they always take things literally. I used to think the brain was the most important organ. Then I thought, look what's telling me that.

I'd like the file to be:

It's hard to explain puns to kleptomaniacs because they always take things literally.
I used to think the brain was the most important organ. Then I thought, look what's telling me that.

How do I delete the newlines without merging all the lines into one?

You can use a single replace, with the following regex:

re.sub(r"\d+\. |(?<!^)\n", "", modfile, flags=re.MULTILINE)

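(?<!^)\n matches a newline unless that newline sits at the start of a line, so the newline that makes up an empty line is kept while the newline ending a line of text is removed. The re.MULTILINE flag makes ^ match at the beginning of every line instead of only at the start of the string.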
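To see which characters the pattern actually picks out, here is a minimal sketch that runs it against an in-memory string instead of the file. The sample string and its shortened joke text are assumptions made for illustration; only the pattern comes from the answer.

```python
import re

# Assumed stand-in for jokes.txt: two numbered lines separated by an empty line.
sample = "1. First joke.\n\n2. Second joke.\n"

pattern = re.compile(r"\d+\. |(?<!^)\n", flags=re.MULTILINE)

# Collect the offset and text of every match to see what will be removed.
matches = [(m.start(), m.group()) for m in pattern.finditer(sample)]
for start, text in matches:
    print(start, repr(text))

# Removing the matches leaves one joke per line.
cleaned = pattern.sub("", sample)
print(repr(cleaned))
```

The newline at offset 15 is the only character of the empty line, so it sits at a line start and is skipped. That surviving newline is what keeps the two jokes on separate lines.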

In code:

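import re
with open('jokes.txt', 'r+') as original_file:
    modfile = original_file.read()
    # Strip the numbering and every newline that ends a non-empty line.
    modfile = re.sub(r"\d+\. |(?<!^)\n", "", modfile, flags=re.MULTILINE)
    # Rewind and clear the file before writing the cleaned text back.
    original_file.seek(0)
    original_file.truncate()
    original_file.write(modfile)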

You can also use a negative lookahead instead of a lookbehind if you want:

r"\d+\. |\n(?!\n)"
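As a rough comparison, the sketch below applies both patterns to the same in-memory text. The names LOOKBEHIND, LOOKAHEAD and strip_with, and both sample strings, are hypothetical and chosen for illustration only. On input shaped like the question's file the two patterns give the same result; they can differ on other input, for example text that begins with an empty line.

```python
import re

# Hypothetical names for the two patterns from the answer.
LOOKBEHIND = re.compile(r"\d+\. |(?<!^)\n", flags=re.MULTILINE)
LOOKAHEAD = re.compile(r"\d+\. |\n(?!\n)")  # no ^ used, so no MULTILINE needed


def strip_with(pattern, text):
    """Remove every match of pattern from text."""
    return pattern.sub("", text)


# Assumed stand-in for the file in the question.
sample = "1. First joke.\n\n2. Second joke.\n"
same_a = strip_with(LOOKBEHIND, sample)
same_b = strip_with(LOOKAHEAD, sample)
print(same_a == same_b)

# Assumed edge case: the text starts with an empty line.
leading = "\n1. First joke.\n"
diff_a = strip_with(LOOKBEHIND, leading)  # keeps the leading newline
diff_b = strip_with(LOOKAHEAD, leading)   # drops the leading newline
print(repr(diff_a), repr(diff_b))
```

The lookahead version keeps a newline only when another newline follows it, while the lookbehind version keeps a newline only when it starts a line, which is why the two disagree about a newline at the very beginning of the text.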
