I have a text file containing the following lines for example:
I want to remove the CR LF only when the previous line ends with a CR. Via regex I can remove all of these characters, but I can't build the condition that checks the previous line.
Can anyone help me?
You could replace
\r(?= *\r)
with nothing.
It simply matches a CR that is followed by another CR (optionally preceded by spaces). The actual match is only the first CR; the rest is checked by a look-ahead, so the replacement removes only the CR that is missing its LF.
It's a slight variation of what you're asking for: instead of removing the CRLF, it removes the lone CRs, which gives a more uniform file with CRLF line endings throughout instead of a mix of lone CR and CRLF line endings.
For example:
Regex re = new Regex("\r(?= *\r)");
string sResult = re.Replace(sInput, "");
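To make the look-ahead behaviour concrete, here is a minimal sketch on a made-up input. The sample string, the class name and the Show helper are assumptions for illustration and do not come from the question.

```csharp
using System;
using System.Text.RegularExpressions;

class LoneCrDemo
{
    // Hypothetical helper: makes CR and LF visible in the output.
    static string Show(string s) =>
        s.Replace("\r", "<CR>").Replace("\n", "<LF>");

    static void Main()
    {
        // Assumed sample: "Line 1" ends in a lone CR; the next line
        // holds a single space followed by a proper CRLF.
        string input = "Line 1\r \r\nLine 2\r\n";

        // The look-ahead checks for " *\r" without consuming it,
        // so the match itself is just the one CR.
        Match m = Regex.Match(input, @"\r(?= *\r)");
        Console.WriteLine($"index={m.Index}, length={m.Length}");
        // prints: index=6, length=1

        string result = Regex.Replace(input, @"\r(?= *\r)", "");
        Console.WriteLine(Show(result));
        // prints: Line 1 <CR><LF>Line 2<CR><LF>
    }
}
```

In this sample the CRLF after Line 2 is left alone, because its CR is followed by an LF rather than by another CR.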
Edit
Thinking about it, my solution takes the spaces at the beginning of the line that follows a CR-only line and leaves them at the end of that CR-only line. The solution you describe would leave them at the beginning of the next line. I'm guessing the preferred result is to remove them. For this, change the regex to
\r *(?=\r)
making it match the spaces as well, making the replace remove them.
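A small side-by-side run of the two patterns may help here. This is only a sketch: the sample string, the class name and the Show helper are assumptions for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

class TrimSpacesDemo
{
    // Hypothetical helper: makes CR and LF visible in the output.
    static string Show(string s) =>
        s.Replace("\r", "<CR>").Replace("\n", "<LF>");

    static void Main()
    {
        // Assumed sample: a lone CR, then a space, then a CRLF.
        string input = "Line 1\r \r\nLine 2\r\n";

        // Spaces sit inside the look-ahead: they are checked but kept.
        string keepSpaces = Regex.Replace(input, @"\r(?= *\r)", "");

        // Spaces sit in the match itself: they are removed with the CR.
        string dropSpaces = Regex.Replace(input, @"\r *(?=\r)", "");

        Console.WriteLine(Show(keepSpaces));
        // prints: Line 1 <CR><LF>Line 2<CR><LF>
        Console.WriteLine(Show(dropSpaces));
        // prints: Line 1<CR><LF>Line 2<CR><LF>
    }
}
```

Both patterns match only literal spaces; tabs between the two CRs would need a character class such as [ \t] instead of the plain space.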
Used the following code to achieve this:
String strFile = File.ReadAllText(@file, Encoding.Default);
Regex re = new Regex("\r(?= *\r)");
strFile = re.Replace(strFile, "");
File.WriteAllText(@file + ".tmp", strFile);
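For reference, a self-contained variant of the same idea might look like the sketch below. It is not the code actually used above: the Fix helper, the temporary file and the choice of UTF-8 are assumptions, and it swaps in the space-removing pattern \r *(?=\r) from the edit in the first answer. It also passes the same encoding to the write call, since File.WriteAllText without an encoding argument writes UTF-8 regardless of how the file was read.

```csharp
using System;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

class FixLineEndings
{
    // Hypothetical helper: reads `path`, removes lone CRs (and the spaces
    // after them) and writes the result to `path + ".tmp"`.
    static void Fix(string path, Encoding encoding)
    {
        string text = File.ReadAllText(path, encoding);
        text = Regex.Replace(text, @"\r *(?=\r)", "");
        // Write with the same encoding that was used for reading.
        File.WriteAllText(path + ".tmp", text, encoding);
    }

    static void Main()
    {
        // Assumed sample file, created only for this demonstration.
        string path = Path.GetTempFileName();
        File.WriteAllText(path, "Line 1\r \r\nLine 2\r\n", Encoding.UTF8);

        Fix(path, Encoding.UTF8);

        string fixedText = File.ReadAllText(path + ".tmp", Encoding.UTF8);
        Console.WriteLine(fixedText.Replace("\r", "<CR>").Replace("\n", "<LF>"));
        // prints: Line 1<CR><LF>Line 2<CR><LF>
    }
}
```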
You may use
(\r)[\p{Zs}\t]*\r\n
and replace with $1.
Details
(\r) - Group 1: a CR
[\p{Zs}\t]* - followed by 0+ horizontal whitespace characters
\r\n - and a CRLF.
The replacement is the CR captured in Group 1. See a C# demo:
var s = " Line 1\r \r\n Line 2\r\n \r\n more text";
Console.WriteLine(Regex.Replace(s, @"(\r)[\p{Zs}\t]*\r\n", "$1")
.Replace("\r", "<CR>").Replace("\n", "<LF>"));
// => Line 1<CR> Line 2<CR><LF> <CR><LF> more text