Perl \R regex to strip Windows newline characters

I'm using a Perl script with the following code to remove possible Windows newline characters from an input file:

foreach my $line (split /\r|\R/)

Executing the same script on two different Linux machines gives different results. On machine1 the script works as intended; on machine2 the line is split every time a capital "R" character is found, and the result is garbled.

I would like to know whether the \R regex is correct and how to make machine2 behave as intended.

In Perl, line endings can be matched in several different ways:

\n matches a line-feed (newline) character (ASCII 10)
\r matches a carriage return (ASCII 13)
\R matches any Unicode newline sequence (the verbs that can modify it belong to PCRE, not to Perl's own regex engine)

Windows uses the two characters ASCII 13 + ASCII 10 (\r\n) and Unix uses ASCII 10 (\n). The \R expression matches any Unicode newline sequence, including \r, \n and \r\n.
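As a minimal sketch of that behaviour, the snippet below splits a string that mixes all three endings. The helper name split_generic is made up for this illustration, and the code assumes Perl 5.10 or later so that \R is recognised.

```perl
use strict;
use warnings;

# Hypothetical helper: split text into lines on any generic newline (\R).
# Assumes Perl 5.10 or later.
sub split_generic {
    my ($text) = @_;
    return split /\R/, $text;
}

# Mixed input: Unix (\n), Windows (\r\n) and bare carriage return (\r) endings.
my @lines = split_generic("unix\nwindows\r\nold mac\rlast");
print join('|', @lines), "\n";   # unix|windows|old mac|last
```

Note that \r\n is consumed as a single unit, so no empty field appears between the carriage return and the line feed.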

The likely reason \R works on one machine and not the other is that they run different versions of Perl. \R was introduced in perl 5.10.0, so if the other machine is using an older version, updating should solve your issue.
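If updating is not an option, one possible workaround is to pick the pattern at run time from the interpreter version in $]. This is only a sketch: the sub name newline_pattern and the fallback pattern are assumptions for illustration, and the fallback covers just \r\n, \n and \r rather than everything \R matches.

```perl
use strict;
use warnings;

# $] holds the running interpreter's version as a number (5.010000 for 5.10.0).
# The pattern is kept in a string so that an old perl never has to compile \R.
sub newline_pattern {
    my ($version) = @_;
    my $source = $version >= 5.010 ? '\R' : '\r\n|\n|\r';
    return qr/$source/;
}

my $newline = newline_pattern($]);
print join('|', split($newline, "QRS\r\nTUV\r\n")), "\n";   # QRS|TUV
```

Running perl -v on each machine is the quickest way to confirm whether the versions really differ.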

One of your machines must be using a rather ancient version of Perl.

5.8:

$ perl -wle'print for split /\R/, "QRS\r\nTUV\r\n";'
Unrecognized escape \R passed through at -e line 1.
Q
S
TUV

5.10:

$ perl -wle'print for split /\R/, "QRS\r\nTUV\r\n";'
QRS
TUV

Always use "use strict; use warnings;"!

Alternatives:

  • split /[\r\n]/ . This is equivalent to what you are using, but it's probably buggy.
  • split /\n|\r\n?/ . This is equivalent to split /\R/ .
  • split /\r?\n/ . This matches Unix and Windows line endings.
  • split /\r\n/ . This matches Windows line endings.

I'd use the second-to-last one.
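To see why the character-class version is likely to misbehave, here is a small comparison on the same Windows-style input used in the one-liners above. It is a sketch for illustration only; the variable names are arbitrary.

```perl
use strict;
use warnings;

my $text = "QRS\r\nTUV\r\n";

# Character class: \r and \n are matched separately, so every Windows
# line ending produces an extra empty field between the two characters.
my @by_class = split /[\r\n]/, $text;

# Optional \r followed by \n: one match per Unix or Windows line ending.
my @by_crlf = split /\r?\n/, $text;

print scalar(@by_class), " fields: ", join('|', @by_class), "\n";   # 3 fields: QRS||TUV
print scalar(@by_crlf),  " fields: ", join('|', @by_crlf),  "\n";   # 2 fields: QRS|TUV
```

Trailing empty fields are dropped by split in both cases, which is why only the empty field in the middle survives.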

I use Perl nearly every day.

However, if all I have to do is convert line endings, then I use

  • dos2unix to convert to Linux/Unix line endings
  • unix2dos to convert to Windows line endings.
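When those tools are not installed, roughly the same conversions can be sketched in Perl itself. The subs to_unix and to_dos are hypothetical names for this example, and they only approximate the tools: they handle \r\n and \n and nothing else.

```perl
use strict;
use warnings;

# Hypothetical helpers that approximate the line-ending conversions.
sub to_unix {
    my ($text) = @_;
    $text =~ s/\r\n/\n/g;      # CRLF -> LF
    return $text;
}

sub to_dos {
    my ($text) = @_;
    $text =~ s/\r?\n/\r\n/g;   # LF or CRLF -> CRLF, without doubling an existing \r
    return $text;
}

print to_unix("a\r\nb\r\n") eq "a\nb\n"     ? "ok\n" : "not ok\n";
print to_dos("a\nb\r\n")    eq "a\r\nb\r\n" ? "ok\n" : "not ok\n";
```

The copy-then-substitute style is used instead of the s///r modifier so that the sketch also runs on older perls.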
