Perl \R regex to strip Windows newline characters

Question

I'm using a Perl script with the following code to remove possible Windows newline characters from an input file:

foreach my $line (split /\r|\R/)

Executing the same script on two different Linux machines gives different results. On machine1 the script works as intended; on machine2 the line is split every time a capital "R" character is found, and the result is garbled.

I would like to know whether the \R regex is correct and how to make machine2 behave as intended.

Answer 1

Perl has several escapes for matching line-ending characters:

\n matches a line-feed (newline) character (ASCII 10)
\r matches a carriage return (ASCII 13)
\R matches any Unicode newline sequence; can be modified using verbs

Windows uses the two characters ASCII 13 + ASCII 10 (\r\n) and Unix uses ASCII 10 (\n). The \R expression matches any Unicode newline sequence, including \r, \n, and \r\n.

The likely reason \R works on one machine and not the other is differing versions of Perl. \R was introduced in Perl 5.10.0, so if the other machine is using an older version, updating should solve your issue.
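To confirm the version explanation on each machine, something like the following could be run. It is only a sketch: the helper name supports_generic_newline and the fallback pattern are my own illustration, not part of the original script, and the fallback covers only CRLF, LF, and a lone CR rather than every Unicode line break.

```perl
use strict;
use warnings;

# $] holds the interpreter version as a number, e.g. 5.010000 for Perl 5.10.0.
printf "Running Perl %s\n", $];

# Hypothetical helper: true when the interpreter is new enough to know \R.
sub supports_generic_newline {
    return $] >= 5.010 ? 1 : 0;
}

# Build the pattern from a string so an old interpreter never compiles \R.
# Assumption: the fallback only needs to handle CRLF, LF, and a lone CR.
my $pattern = supports_generic_newline() ? '\R' : '\r\n|\n|\r';
my $newline = qr/$pattern/;

# A capital "R" in the data must not be treated as a separator.
my @fields = split $newline, "QRS\r\nTUV\r\n";
print join('|', @fields), "\n";
```

If machine2 reports a version below 5.010, the fallback branch is the one being used there.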

Answer 2

One of your machines must be using a rather ancient version of Perl.

5.8:

$ perl -wle'print for split /\R/, "QRS\r\nTUV\r\n";'
Unrecognized escape \R passed through at -e line 1.
Q
S
TUV

5.10:

$ perl -wle'print for split /\R/, "QRS\r\nTUV\r\n";'
QRS
TUV

Always use use strict; use warnings;!

Alternatives:

split /[\r\n]/. This is equivalent to what you are using, but it's probably buggy: a Windows \r\n is treated as two separators, which produces an empty field between them.
split /\n|\r\n?/. This is equivalent to split /\R/ for LF, CR, and CRLF line endings.
split /\r?\n/. This matches Unix and Windows line endings.
split /\r\n/. This matches Windows line endings only.

I'd use the second-to-last.
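A quick way to see the difference between the first and third alternatives is to run both against the same mixed input. The sample string here is made up for illustration.

```perl
use strict;
use warnings;

# Sample input mixing a Windows (CRLF) and a Unix (LF) line ending,
# with a capital "R" in the data to show it is left alone.
my $input = "QRS\r\nTUV\nWXR\r\n";

# \r?\n treats CRLF and LF each as a single separator.
my @lines = split /\r?\n/, $input;

# [\r\n] treats CR and LF as separate separators, so CRLF yields an empty field.
my @buggy = split /[\r\n]/, $input;

print scalar(@lines), " fields: ", join('|', @lines), "\n";  # 3 fields: QRS|TUV|WXR
print scalar(@buggy), " fields: ", join('|', @buggy), "\n";  # 4 fields: QRS||TUV|WXR
```

Both patterns work on Perl 5.8, so neither depends on \R being available.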

Answer 3

I use Perl nearly every day.

However, if all I have to do is convert line endings, then I use

dos2unix to convert to Linux/Unix line endings
unix2dos to convert to Windows line endings.
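If those tools are not installed, the CRLF-to-LF direction can be approximated inside Perl itself. This is a minimal sketch; crlf_to_lf is a hypothetical helper name, and it does not try to reproduce anything else dos2unix does.

```perl
use strict;
use warnings;

# Hypothetical helper: return a copy of the text with every CRLF turned into LF.
# A lone CR that is not followed by LF is left untouched.
sub crlf_to_lf {
    my ($text) = @_;
    $text =~ s/\r\n/\n/g;
    return $text;
}

print crlf_to_lf("QRS\r\nTUV\r\n");
```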

3 answers

Answer 1
6 2015-06-04 15:04:01

Answer 2
3 2015-06-04 15:06:31

Answer 3
0 2016-08-05 14:33:49
