How do I join lines that end in "\n" (LF) into one line, up to the next "\r\n" (CRLF)?

I'm looking for a way to clean up a file. I have a "*.csv" file in which every line should end with CRLF ("\r\n"). Sometimes the file arrives with broken lines that end with only LF ("\n"), with the CR missing. I need to join each run of LF-only lines into one single line, with no spaces left where the pieces are joined, so that the resulting line also ends with CRLF.

For example, here's a Python representation of the file's content:

file_content = '''\
"A",B,"C","D"\r\n\
"E",F,"G","H"\r\n\
"I",J\n\
       \n\
             ,"K",    \n\
\n\
"L"\r\n\
"O",P,"Q","R"\r\n\
"S",T,"U","V"\r\n\
'''

Two possible solutions are:

import re

file_content = '''\
"A",B,"C","D"\r\n\
"E",F,"G","H"\r\n\
"I",J\n\
       \n\
             ,"K",    \n\
\n\
"L"\r\n\
"O",P,"Q","R"\r\n\
"S",T,"U","V"\r\n\
'''

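print("Original:\n" + file_content)

# Remove every LF that is not preceded by CR, along with the spaces around it.
replace1 = re.sub(r"(?<!\r) *\n *", '', file_content)
print("Replace1:\n" + replace1)

# Alternative without a lookbehind: keep the non-CR character before the LF run.
replace2 = re.sub(r"([^\r])( *\n *)+", r'\1', file_content)
print("Replace2:\n" + replace2)

The output from that script is: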

Original:
"A",B,"C","D"
"E",F,"G","H"
"I",J

             ,"K",    

"L"
"O",P,"Q","R"
"S",T,"U","V"

Replace1:
"A",B,"C","D"
"E",F,"G","H"
"I",J,"K","L"
"O",P,"Q","R"
"S",T,"U","V"

Replace2:
"A",B,"C","D"
"E",F,"G","H"
"I",J,"K","L"
"O",P,"Q","R"
"S",T,"U","V"

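The examples above work on a string that is already in memory. To apply the same idea to a real file, something like the following sketch might work. It assumes Python 3, that the file fits in memory, and that no quoted field legitimately contains a bare LF (such a line break would be removed too). The names `BARE_LF`, `join_broken_lines`, and `clean_file` are made up for illustration. The file is opened in binary mode because text mode would, by default, translate "\r\n" to "\n" on read and hide the difference between the two line endings.

```python
import re

# A bare LF (one not preceded by CR) together with any spaces around it.
# This is the first pattern from above, written as a bytes pattern.
BARE_LF = re.compile(rb"(?<!\r) *\n *")


def join_broken_lines(data: bytes) -> bytes:
    """Remove every bare LF and its surrounding spaces; CRLF is left alone."""
    return BARE_LF.sub(b"", data)


def clean_file(src_path: str, dst_path: str) -> None:
    """Read src_path as raw bytes, join the broken lines, write dst_path."""
    # Binary mode keeps "\r\n" exactly as it is stored on disk.
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(join_broken_lines(data))
```

Calling `clean_file("in.csv", "out.csv")` (both paths are placeholders) would write a copy in which only the CRLF line endings remain.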