I'm looking for a solution to this file cleanup case. I have a "*.csv" file that contains several lines, and every line should end with CR LF ("\r\n"). Sometimes the file arrives with broken lines that end with only LF ("\n"), missing the CR ("\r"). I need to join all those lines that end with only LF into one single line, without any blank lines or stray spaces in between, so that the joined line also ends with CR LF.
For example, here's a Python representation of the file's content:
file_content = '''\
"A",B,"C","D"\r\n\
"E",F,"G","H"\r\n\
"I",J\n\
\n\
,"K", \n\
\n\
"L"\r\n\
"O",P,"Q","R"\r\n\
"S",T,"U","V"\r\n\
'''
Two possible solutions are:
import re
file_content = '''\
"A",B,"C","D"\r\n\
"E",F,"G","H"\r\n\
"I",J\n\
\n\
,"K", \n\
\n\
"L"\r\n\
"O",P,"Q","R"\r\n\
"S",T,"U","V"\r\n\
'''
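print "Original:\n", file_content
# Option 1: delete each "\n" that does not follow "\r", plus any spaces around it.
replace1 = re.sub("(?<!\r) *\n *", '', file_content)
print "Replace1:\n", replace1
# Option 2: keep the character before a run of bare "\n"s (anything but "\r") and drop the run.
replace2 = re.sub("([^\r])( *\n *)+", '\\1', file_content)
print "Replace2:\n", replace2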
The output from that Python 2 script is:
Original:
"A",B,"C","D"
"E",F,"G","H"
"I",J
,"K",
"L"
"O",P,"Q","R"
"S",T,"U","V"
Replace1:
"A",B,"C","D"
"E",F,"G","H"
"I",J,"K","L"
"O",P,"Q","R"
"S",T,"U","V"
Replace2:
"A",B,"C","D"
"E",F,"G","H"
"I",J,"K","L"
"O",P,"Q","R"
"S",T,"U","V"
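To apply the same idea to a real file rather than a string literal, a minimal Python 3 sketch might look like the following. It reuses the lookbehind pattern from the first solution as a bytes pattern. The names `join_broken_lines` and `clean_csv_file` are hypothetical, and the sketch assumes the file fits in memory and that no field legitimately contains a bare LF.

```python
import re

# Matches one bare LF (an LF not preceded by CR), plus any spaces around it.
# This is a bytes pattern because the file is read in binary mode.
BARE_LF = re.compile(rb"(?<!\r) *\n *")


def join_broken_lines(data: bytes) -> bytes:
    """Remove every bare LF (and the spaces around it), leaving CRLF intact."""
    return BARE_LF.sub(b"", data)


def clean_csv_file(src_path: str, dst_path: str) -> None:
    """Read src_path, join its broken lines, and write the result to dst_path."""
    # Binary mode stops Python from translating "\r\n" to "\n" on read,
    # which would make broken and intact line endings indistinguishable.
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(join_broken_lines(data))
```

A call such as `clean_csv_file("input.csv", "output.csv")` (file names are placeholders) would write the cleaned copy next to the original. Opening the file in text mode with the default `newline` setting would not work here, since the CR characters would be gone before the regex ever saw them.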