I know CR LF (\r\n) is interpreted as two characters, "carriage return" + "new line", but how does that affect different programs when the file is, for example, source code?
As it is a sequence of whitespace characters, CRLF is ignored in C, but not in Bash:
If the first line of a bash script (#!/bin/bash) has a CRLF line terminator, the script won't run. It will be looking for the file /bin/bash\r, which doesn't exist.
If any of the other lines of a script have a CRLF line terminator, the command on that line will either not be found (as bash is looking for a command named some_command\r), or will be passed a \r at the end of its last parameter.
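A minimal sketch of how one might check a script for CR bytes and strip them before running it. The function names has_cr and strip_cr are hypothetical, chosen only for illustration; the sketch assumes a POSIX shell with grep and tr available.

```shell
# Hypothetical helpers; the names has_cr and strip_cr are illustrative only.

# Succeeds (exit status 0) if the named file contains at least one CR byte.
has_cr() {
    # Command substitution strips trailing newlines only, so the CR survives.
    cr=$(printf '\r')
    grep -q "$cr" "$1"
}

# Copies standard input to standard output with every CR byte removed.
strip_cr() {
    tr -d '\r'
}
```

For example, `has_cr script.sh && strip_cr < script.sh > script_lf.sh` writes an LF-only copy when the original contains CRs. Note that strip_cr removes every CR, not only those that precede a newline.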
The shell does not treat CR as white space by default.
Source code (crlf67.sh) with CR marked by ^M:
#!/bin/sh^M
^M
echo "Hello^M
World!"^M
Running the command explicitly:
$ sh crlf67.sh
: command not found
Hello
World!
$ sh crlf67.sh 2>&1 | vis -r
crlf67.sh: line 2: ^M: command not found
Hello^M
World!^M
$
(The vis command is an extended version of the vis program from Brian W. Kernighan and Rob Pike, The Unix Programming Environment (Nov 1983). It makes non-printing characters visible.)
If you make the script executable:
$ make crlf67
cat crlf67.sh >crlf67
chmod a+x crlf67
$ crlf67
-bash: ./crlf67: /bin/sh^M: bad interpreter: No such file or directory
$
The kernel doesn't treat the CR as white space either and fails to find the command.
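When a "bad interpreter" error like this appears, looking at the first line with control characters made visible usually confirms the cause. A small sketch, assuming cat supports the -v option (GNU and BSD versions do); the function name show_shebang is hypothetical.

```shell
# Hypothetical helper; the name show_shebang is illustrative only.
# Prints the first line of the named file with control characters made
# visible, so a trailing CR appears as ^M.
show_shebang() {
    head -n 1 "$1" | cat -v
}
```

For a script whose first line is #!/bin/sh followed by CRLF, `show_shebang crlf67` prints `#!/bin/sh^M`, which matches the interpreter path in the error message.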
In C source code, officially, you can't use a backslash to continue a line if the line ending is CRLF, because the character after the backslash isn't a newline (NL or LF); it's a CR. Some compilers will ignore white space (at least the CR) after the last backslash on a line: GCC 9.1.0 for one, but also earlier versions. It warns about spaces after a trailing backslash (unless you use -Werror as I do; then it's an error). That isn't what the standard stipulates; however, even -pedantic doesn't stop GCC from ignoring the erroneous notation.
Source code (crlf19.c) with CR marked by ^M and newline marked by ^J:
#include <stdio.h>^M^J
^M^J
int main(void)^M^J
{^M^J
printf("Hello\ ^M^J
world!\ ^M^J
\n");^M^J
return 0;^M^J
}^M^J
Compilation by GCC 9.1.0 on macOS 10.14.5 Mojave:
$ gcc -O3 -g -std=c11 -Wall -Wextra -pedantic crlf19.c -o crlf19
crlf19.c: In function ‘main’:
crlf19.c:5:18: warning: backslash and newline separated by space
5 | printf("Hello\
|
crlf19.c:6:8: warning: backslash and newline separated by space
6 | world!\
|
$ gcc -O3 -g -std=c11 -Wall -Wextra -Werror crlf19.c -o crlf19
crlf19.c: In function ‘main’:
crlf19.c:5:18: error: backslash and newline separated by space [-Werror]
5 | printf("Hello\
|
crlf19.c:6:8: error: backslash and newline separated by space [-Werror]
6 | world!\
|
cc1: all warnings being treated as errors
$
This behaviour goes back at least as far as GCC 4.1.2; that version was tested on an antediluvian RHEL 5 box.
If you remove the spaces after the backslash leaving just the CRLF line endings, GCC doesn't complain at all.
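Since other compilers may be stricter than GCC, it can be useful to find such lines before compiling. The following is a rough sketch, assuming a POSIX grep; the function name bad_continuations is hypothetical. It lists lines where a backslash is followed only by white space (spaces, tabs, or a CR) before the newline, which are the lines the standard would not treat as continued.

```shell
# Hypothetical helper; the name bad_continuations is illustrative only.
# Prints the line number of every line in the named file where a backslash
# is followed only by white space (including a CR) before the newline.
bad_continuations() {
    # [[:space:]] matches CR as well as space and tab; grep splits lines on LF,
    # so a CR from a CRLF ending is still part of the line being matched.
    grep -n '\\[[:space:]][[:space:]]*$' "$1" | cut -d: -f1
}
```

A backslash immediately followed by LF is a valid continuation and is not reported. In a file with CRLF endings, every continued line is reported, with or without extra spaces after the backslash.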
It depends on the program that's processing the file. I don't believe there's any general rule.
For example, I just created several shell scripts in an otherwise empty directory. One of them is named some_command with an ASCII CR as the last character of the file name.
I can invoke that command from a shell script by including that CR as part of the command name. The shell (sh, bash, or ksh) doesn't treat the CR character as white space.
$ ls -l
total 16
-rwxr-xr-x 1 kst kst 26 Jul 1 16:46 crlf.bash
-rwxr-xr-x 1 kst kst 25 Jul 1 16:46 crlf.ksh
-rwxr-xr-x 1 kst kst 24 Jul 1 16:46 crlf.sh
-rwxr-xr-x 1 kst kst 21 Jul 1 16:49 'some_command'$'\r'
$ cat -v crlf.bash
#!/bin/bash
some_command^M
$ cat -v crlf.ksh
#!/bin/ksh
some_command^M
$ cat -v crlf.sh
#!/bin/sh
some_command^M
$ cat -v some_command
#!/bin/sh
echo hello
$ ./crlf.bash
hello
$ ./crlf.ksh
hello
$ ./crlf.sh
hello
$
The version of ls I'm using (GNU coreutils 8.28) has a special syntax for showing file names that contain special characters. cat -v shows CR characters as ^M.
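The experiment can be reproduced with a short sketch along these lines. It assumes mktemp -d is available, that /bin/sh exists, and that the file system allows a CR in a file name; the function name run_cr_demo and the temporary directory are hypothetical. The directory is put on PATH so that the shell can find the command by name.

```shell
# Hypothetical reproduction; run_cr_demo and the temporary directory are
# illustrative and not part of the original answer.
run_cr_demo() {
    cr=$(printf '\r')
    dir=$(mktemp -d) || return 1
    # A command whose file name ends in a CR byte.
    printf '#!/bin/sh\necho hello\n' > "$dir/some_command$cr"
    # A caller whose only command line ends in CRLF.
    printf '#!/bin/sh\nsome_command\r\n' > "$dir/crlf.sh"
    chmod +x "$dir/some_command$cr" "$dir/crlf.sh"
    # Put the directory on PATH so the shell can look up "some_command\r".
    PATH="$dir:$PATH" "$dir/crlf.sh"
    status=$?
    rm -rf "$dir"
    return "$status"
}
```

Calling `run_cr_demo` runs the CRLF-terminated script, which in turn invokes the command whose name ends in CR, because the shell keeps the CR as part of the command name.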