
Perl regular expression problem

I have this conditional in a Perl script:

if ($lnFea =~ m/^(\d+) qid\:([^\s]+).*?\#docid = ([^\s]+) inc = ([^\s]+) prob = ([^\s]+)$/)

and $lnFea holds a line like this:

0 qid:7968 1:0.000000 2:0.000000 3:0.000000 4:0.000000 5:0.000000 6:0.000000 7:0.000000 8:0.000000 9:0.000000 10:0.000000 11:0.000000 12:0.000000 13:0.000000 14:0.000000 15:0.000000 16:0.005175 17:0.000000 18:0.181818 19:0.000000 20:0.003106 21:0.000000 22:0.000000 23:0.000000 24:0.000000 25:0.000000 26:0.000000 27:0.000000 28:0.000000 29:0.000000 30:0.000000 31:0.000000 32:0.000000 33:0.000000 34:0.000000 35:0.000000 36:0.000000 37:0.000000 38:0.000000 39:0.000000 40:0.000000 41:0.000000 42:0.000000 43:0.055556 44:0.000000 45:0.000000 46:0.000000 #docid = GX000-00-0000000 inc = 1 prob = 0.0214125

The problem is that the if condition is true on Windows but false on Linux (Fedora 11). Both systems are running the most recent Perl version. What is causing this?

Assuming that $lnFea is read from a file, I'd wager that the file is in DOS format. On Linux, that would keep the pattern from matching at the $ anchor because of the difference in line endings between the two platforms. Perl's automagic newline translation only applies to platform-native text files, so if the input file is in DOS format, the Linux box sees an extra carriage return before the end of the line. The final ([^\s]+) cannot consume that carriage return, since it counts as whitespace, and $ only matches at the very end of the string or just before a trailing newline.
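To see the effect in isolation, here is a minimal sketch that runs the pattern from the question against the same record with an LF ending and with a CRLF ending. The record is shortened for readability and the variable names are made up for the illustration; only the regex comes from the question.

```perl
use strict;
use warnings;

# The pattern from the question, unchanged.
my $re = qr/^(\d+) qid\:([^\s]+).*?\#docid = ([^\s]+) inc = ([^\s]+) prob = ([^\s]+)$/;

# Shortened sample record (most feature columns trimmed for brevity).
my $record = '0 qid:7968 1:0.000000 46:0.000000 #docid = GX000-00-0000000 inc = 1 prob = 0.0214125';

my $unix_line = $record . "\n";      # line as read on Linux from a Unix-format file
my $dos_line  = $record . "\r\n";    # line as read on Linux from a DOS-format file

my $unix_ok = ($unix_line =~ $re) ? 1 : 0;
my $dos_ok  = ($dos_line  =~ $re) ? 1 : 0;

print "LF ending:   ", ($unix_ok ? "match" : "no match"), "\n";
print "CRLF ending: ", ($dos_ok  ? "match" : "no match"), "\n";
```

The LF case matches because $ is allowed to sit just before a final newline; in the CRLF case the carriage return is left between the last capture group and that newline, so the anchor has nowhere to match.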

It's probably best to convert the input file to the native format for each platform. If that's not possible, binmode the filehandle before reading from it (which prevents Perl from performing newline translation), and account for the various newline sequences in the regex and anywhere else the data is used.
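One way the second option might look is sketched below. It assumes the data can be read line by line and that stripping the line ending before matching is acceptable; the in-memory string stands in for the real input file, and the hash keys are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical in-memory "file" with a DOS line ending, standing in for the real input.
my $data = "0 qid:7968 1:0.000000 #docid = GX000-00-0000000 inc = 1 prob = 0.0214125\r\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
binmode $fh;    # read bytes as-is, with no newline translation on any platform

my @rows;
while (my $line = <$fh>) {
    $line =~ s/\r?\n\z//;    # strip either an LF or a CRLF line ending
    if ($line =~ m/^(\d+) qid:(\S+).*?#docid = (\S+) inc = (\S+) prob = (\S+)$/) {
        push @rows, { label => $1, qid => $2, docid => $3, inc => $4, prob => $5 };
    }
}
close $fh;

print "docid=$_->{docid} prob=$_->{prob}\n" for @rows;
```

Stripping the ending up front keeps the pattern itself free of platform details, so the same script behaves identically on Windows and Linux whichever format the file is in. Opening the file with the :crlf PerlIO layer is another possibility, though that assumes every input really is in DOS format.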
