
Perl \R regex strip Windows newline character

I have a Perl script that uses the following code to remove possible Windows newline characters from an input file:

foreach my $line (split /\r|\R/)

Running the same script on two different Linux machines gives different results. On machine1 the script works as intended; on machine2 the line is split every time a capital "R" character is found, and the result is garbled.

I would like to know whether the \R regex is correct and how to make machine2 behave as intended.

In Perl, there are several escapes for matching line endings, and they differ in what they match:

\n matches a line-feed (newline) character (ASCII 10)
\r matches a carriage return (ASCII 13)
\R matches any Unicode newline sequence; can be modified using verbs

Windows uses the two characters ASCII 13 + ASCII 10 (\r\n), while Unix uses ASCII 10 alone (\n). The \R escape matches any Unicode newline sequence, including \r, \n and \r\n.

The most likely reason \R works on one machine and not the other is that they run different versions of Perl. \R was introduced in Perl 5.10.0, so if the other machine is using an older version, updating it should solve your issue.
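If upgrading is not possible right away, one option is to declare the minimum Perl version at the top of the script, so an old perl stops with a version error instead of quietly splitting on a capital "R". A minimal sketch, assuming a made-up sample string:

```perl
use 5.010;      # compile-time error on any perl older than 5.10.0
use strict;
use warnings;

# Sample text with Windows line endings, made up for illustration.
my $text = "QRS\r\nTUV\r\n";

# \R treats \r\n as a single newline sequence, so no empty fields appear.
my @lines = split /\R/, $text;

print "$_\n" for @lines;
# prints:
# QRS
# TUV
```

Running perl -v on both machines shows which versions are actually installed.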

More info :

One of your machines must be using a rather ancient version of Perl.

5.8:

$ perl -wle'print for split /\R/, "QRS\r\nTUV\r\n";'
Unrecognized escape \R passed through at -e line 1.
Q
S
TUV

5.10:

$ perl -wle'print for split /\R/, "QRS\r\nTUV\r\n";'
QRS
TUV

Always use use strict; use warnings; !

Alternatives:

  • split /[\r\n]/ . This is equivalent to what you are using, but it's probably buggy: a Windows \r\n is treated as two separators, leaving an empty field between them.
  • split /\n|\r\n?/ . This is equivalent to split /\R/ for \n, \r and \r\n.
  • split /\r?\n/ . This matches Unix and Windows line endings.
  • split /\r\n/ . This matches Windows line endings only.

I'd use the second last.
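To see why the first alternative is probably buggy, here is a small comparison that runs on old and new perls alike, since it avoids \R entirely. The sample string is made up for illustration:

```perl
use strict;
use warnings;

# Sample text with Windows line endings, made up for illustration.
my $text = "QRS\r\nTUV\r\n";

# The character class splits on \r and on \n separately, so each \r\n
# produces an empty field between the two characters.
my @buggy = split /[\r\n]/, $text;

# An optional \r followed by \n consumes \r\n as one separator.
my @clean = split /\r?\n/, $text;

print scalar(@buggy), " fields with /[\\r\\n]/\n";   # 3 fields: "QRS", "", "TUV"
print scalar(@clean), " fields with /\\r?\\n/\n";    # 2 fields: "QRS", "TUV"
```

split drops trailing empty fields by default, which is why the empty strings after the final \r\n do not show up in either count.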

I use Perl nearly every day.

However, if all I have to do is convert line endings, then I use

  • dos2unix to convert to Linux/Unix line endings
  • unix2dos to convert to Windows line endings.
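If those tools are not installed, the same conversions can be approximated in Perl with a substitution. This is only a rough sketch; to_unix and to_dos are hypothetical helper names, not part of Perl or of dos2unix:

```perl
use strict;
use warnings;

# Hypothetical helper: turn Windows \r\n endings into Unix \n.
sub to_unix {
    my ($text) = @_;
    $text =~ s/\r\n/\n/g;
    return $text;
}

# Hypothetical helper: turn Unix \n endings into Windows \r\n.
# The optional \r keeps existing \r\n endings from becoming \r\r\n.
sub to_dos {
    my ($text) = @_;
    $text =~ s/\r?\n/\r\n/g;
    return $text;
}

print length(to_unix("QRS\r\nTUV\r\n")), "\n";   # 8: each \r\n shrank to \n
print length(to_dos("QRS\nTUV\n")), "\n";        # 10: each \n grew to \r\n
```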
