How to delete CRLFs on a specific line with a loop in Python

Question

I have a file like this:

"Headers1"|"Headers2"|"Headers3"|"Headers4"(CR LF)
"Line1"|"Line1"|"Line1"|"Line1"(CR LF)
"Line2"|"Line2"|"Line2"|"Line2"(CR LF)
"Line3"|"Line3Column2 (CR LF)
Line3Column2 (CR LF)"|"Line3 Column3 (CR LF)
Line3 Column3"|"Line3"(CR LF)
"Line4"|"Line4"|"Line4"|"Line4"(CR LF)

I want to delete the CRLFs on lines 3 and 4, so that line 5 moves up onto line 4 and line 4 onto line 3.

But I don't want to delete the CRLF on line 5; otherwise my line 3 and my line 4 will end up on the same line.

In the end, I want:

"Headers1"|"Headers2"|"Headers3"|"Headers4"(CR LF)
"Line1"|"Line1"|"Line1"|"Line1"(CR LF)
"Line2"|"Line2"|"Line2"|"Line2"(CR LF)
"Line3"|"Line3Column2 (̶C̶R̶L̶F̶) Line3Column2 (̶C̶R̶L̶F̶) "|"Line3 Column3 (̶C̶R̶L̶F̶) Line3 Column3"|"Line3"(CR LF)
"Line4"|"Line4"|"Line4"|"Line4"(CR LF)

I tried to write a loop (when a line has fewer pipes than the header, I delete its CRLF and then restart from the beginning), but it doesn't work.

Answer 1

You could use something like this:

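def clean_up(lines):
    clean_lines = []
    record = ''
    for line in lines:
        record += line
        # an odd number of quotes means a quoted field is still open, so keep
        # reading physical lines (assumes quotes only ever delimit fields)
        if record.count('"') % 2:
            continue
        cols = record.split('|')
        # the last column carries the record's terminating CRLF, so skip it
        cols[:-1] = [col.replace('\r\n', '') for col in cols[:-1]]
        # join them back or do anything you want
        clean_lines.append('|'.join(cols))
        record = ''
    return clean_lines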

The idea is that we replace the unneeded CRLF in all columns except the last one.
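As a small worked illustration of that idea, here is the split/replace/join step applied to the broken record from the question. This is only a sketch: it assumes the whole logical record is already held in one string, and the names `record` and `cleaned` are made up for the example.

```python
# One complete logical record from the question: three embedded CRLFs
# inside quoted fields, plus the terminating CRLF at the very end.
record = (
    '"Line3"|"Line3Column2 \r\nLine3Column2 \r\n"'
    '|"Line3 Column3 \r\nLine3 Column3"|"Line3"\r\n'
)

cols = record.split('|')
# Only the last column holds the terminating CRLF, so it is left alone.
cols[:-1] = [col.replace('\r\n', '') for col in cols[:-1]]
cleaned = '|'.join(cols)

print(repr(cleaned))
```

Note that an embedded CRLF inside the last column would survive this step, since that column is deliberately skipped.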

Optionally, you could make this a generator too, which could help if your file is very large and you don't feel like loading it into memory an extra time.

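def clean_up(lines):
    record = ''
    for line in lines:
        record += line
        if record.count('"') % 2:
            continue  # a quoted field is still open: read the next physical line
        cols = record.split('|')
        cols[:-1] = (col.replace('\r\n', '') for col in cols[:-1])
        yield '|'.join(cols)  # if you want to get the list, "yield cols"
        record = ''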

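and then you use it like this:

with open('filename', newline='') as file:
    # newline='' keeps the '\r\n' endings instead of translating them to '\n'
    # the file will be passed in line-by-line
    for clean_line in clean_up(file):
        print(clean_line, end='')  # each line already ends with its own CRLF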
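The loop described in the question (compare each line's pipe count with the header's) can probably be made to work as well, by accumulating physical lines instead of restarting from the beginning. The sketch below is one possible way to do it, not a definitive implementation. The function name `join_by_pipe_count` is hypothetical, and it assumes that the first line is the header, that pipes never occur inside a field, and that the last column never contains an embedded line break.

```python
def join_by_pipe_count(lines):
    """Merge physical lines until a record has as many pipes as the header."""
    lines = iter(lines)
    header = next(lines, None)
    if header is None:
        return  # empty input
    expected = header.count('|')
    yield header

    pending = ''
    for line in lines:
        pending += line
        if pending.count('|') < expected:
            # Record is still incomplete: drop its line break and keep reading.
            pending = pending.rstrip('\r\n')
            continue
        yield pending
        pending = ''
    if pending:
        yield pending  # trailing incomplete record, emitted as-is


sample = (
    '"Headers1"|"Headers2"|"Headers3"|"Headers4"\r\n'
    '"Line1"|"Line1"|"Line1"|"Line1"\r\n'
    '"Line3"|"Line3Column2 \r\nLine3Column2 \r\n"'
    '|"Line3 Column3 \r\nLine3 Column3"|"Line3"\r\n'
    '"Line4"|"Line4"|"Line4"|"Line4"\r\n'
)
# splitlines(keepends=True) mimics iterating a file opened with newline=''.
records = list(join_by_pipe_count(sample.splitlines(keepends=True)))
for rec in records:
    print(repr(rec))
```

Counting pipes is simpler than tracking quotes, but it breaks as soon as a field contains a literal pipe, which is why the quote-aware approaches are generally safer.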

Answer 2

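You could use the csv module to read the rows, do anything you want with them, and then write them back. Its reader already treats a line break inside a quoted field as part of that field.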

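import csv

# newline='' hands the '\r\n' pairs to the csv module untranslated
with open('old_file.csv', newline='') as f:
    reader = csv.DictReader(f, delimiter='|')
    with open('new_file.csv', 'w', newline='') as wf:
        # QUOTE_ALL keeps every field quoted, as in the desired output
        writer = csv.DictWriter(wf, reader.fieldnames, delimiter='|', quoting=csv.QUOTE_ALL)
        writer.writeheader()  # DictWriter does not write the header on its own
        for row in reader:
            writer.writerow({h: v.replace('\r\n', '') for h, v in row.items()})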
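To check how the csv module behaves on the data from the question without touching any files, the same round trip can be sketched in memory. The `sample` string below is just the question's data retyped with real `\r\n` sequences, and the variable names are illustrative.

```python
import csv
import io

sample = (
    '"Headers1"|"Headers2"|"Headers3"|"Headers4"\r\n'
    '"Line1"|"Line1"|"Line1"|"Line1"\r\n'
    '"Line3"|"Line3Column2 \r\nLine3Column2 \r\n"'
    '|"Line3 Column3 \r\nLine3 Column3"|"Line3"\r\n'
    '"Line4"|"Line4"|"Line4"|"Line4"\r\n'
)

# newline='' keeps the '\r\n' pairs untranslated, as with a real file.
rows = list(csv.reader(io.StringIO(sample, newline=''), delimiter='|'))

# The reader returns four logical rows from seven physical lines;
# the embedded line breaks are still inside the field values here.
cleaned = [[field.replace('\r\n', '') for field in row] for row in rows]

out = io.StringIO(newline='')
csv.writer(out, delimiter='|', quoting=csv.QUOTE_ALL).writerows(cleaned)
result = out.getvalue()
print(result)
```

Unlike splitting on `'|'` by hand, this also copes with pipes that appear inside quoted fields.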
