
Use Python to split text file into chunks by unique character and a blank line at the end

Given a text input file, I want to split it into chunks that contain every line beginning with 'c', with a blank line between chunks. I have successfully isolated every 'c' line, but I can't work out how to keep or add a blank line between chunks.

Here is the infile:

c70 Title -1 c c
c
c
c
c heading1 heading2 heading3 heading4
data data data data

c80 Title -2 c c
c
c
c
c heading1 heading2 heading3 heading4
data data data data

c90 Title -3 c c
c
c
c
c heading1 heading2 heading3 heading4
data data data data

This is my code:

for line in infile:
    if not line.lstrip().startswith('c'):
        copy = True
        continue
    elif line == '\n':
        copy = True
        continue
    elif copy:
        outfile.write(line)

This is my outfile:

c70 Title -1 c c
c
c
c
c heading1 heading2 heading3 heading4
c80 Title -2 c c
c
c
c
c heading1 heading2 heading3 heading4
c90 Title -3 c c
c
c
c
c heading1 heading2 heading3 heading4

This is my desired outfile:

c70 Title -1 c c
c
c
c
c heading1 heading2 heading3 heading4

c80 Title -2 c c
c
c
c
c heading1 heading2 heading3 heading4

c90 Title -3 c c
c
c
c
c heading1 heading2 heading3 heading4

The only difference between my current outfile and my desired outfile is that the desired one keeps the existing blank line (or adds one) between chunks.

I believe this should do what you are hoping for:

for line in infile:
    if line.lstrip().startswith("c") or line == "\n":
        outfile.write(line)

This scans each line and writes it to the output file only if it starts with a 'c' (ignoring leading whitespace) or is a blank line, so the blank lines that already separate the chunks in the input are kept.
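If the input cannot be relied on to have exactly one blank line after each chunk, a variation is to add the separator yourself instead of copying it. The sketch below is one way to do that, not the only one; the function name `keep_c_chunks` and the sample text are made up for illustration. It assumes, as the code above does, that any line starting with 'c' belongs to a chunk, so a data line that happens to start with 'c' would be kept too.

```python
import io


def keep_c_chunks(lines):
    """Yield every line that starts with 'c', plus one blank line after each chunk.

    `lines` is any iterable of text lines, e.g. an open file.
    (Hypothetical helper, written for this example.)
    """
    in_chunk = False
    for line in lines:
        if line.lstrip().startswith("c"):
            in_chunk = True
            # Make sure the line ends with a newline, even at end of file.
            yield line if line.endswith("\n") else line + "\n"
        elif in_chunk:
            # First non-'c' line after a chunk: emit a single blank separator.
            in_chunk = False
            yield "\n"
    if in_chunk:
        # The input ended while still inside a chunk; close it with a blank line.
        yield "\n"


# Small in-memory sample standing in for the real infile.
sample = (
    "c70 Title -1 c c\n"
    "c\n"
    "c heading1 heading2\n"
    "data data\n"
    "\n"
    "c80 Title -2 c c\n"
    "c\n"
    "data data\n"
)
result = "".join(keep_c_chunks(io.StringIO(sample)))
```

With real files, the same generator could be used as `outfile.writelines(keep_c_chunks(infile))`.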

