
How to regex 0 or digit in Perl

I'm trying to capture the following values from the lines in my file shown below:

1.39223 0.303787
71.9792 0

Input file (example):

XLOC_000559 XLOC_000559 -   S3603:13352-18211   con exp OK  1.39223 0.303787    -2.19627    -1.93877    0.0001  0.0140909   yes
XLOC_001511 XLOC_001511 -   S7778:1319-1421 con exp OK  71.9792 0   -inf    -nan    0.00035 0.0365407   yes

I've tried the regex:

my ($con_val, $expt_val) = ($1, $2) if ($_ =~ /OK\t(\d+\.\d+)\t(\d+\.\d+)/);

But it's not working on 0 values...

Can anyone help, please?

There is almost certainly no need to make sure your numbers contain a maximum of one decimal point, and the easiest way to solve this is to use a character class [\d.] that matches any digit or a dot.

Note that a regex is applied to $_ unless you say otherwise, so there is no need to write $_ =~ .

This short program should help you.

use strict;
use warnings;

while (<DATA>) {
  next unless /OK\s+([\d.]+)\s+([\d.]+)/;
  my ($con_val, $expt_val) = ($1, $2);
  print "$con_val, $expt_val\n";
}

__DATA__
XLOC_000559 XLOC_000559 -   S3603:13352-18211   con exp OK  1.39223 0.303787    -2.19627    -1.93877    0.0001  0.0140909   yes
XLOC_001511 XLOC_001511 -   S7778:1319-1421 con exp OK  71.9792 0   -inf    -nan    0.00035 0.0365407   yes

Output:

1.39223, 0.303787
71.9792, 0
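To make the trade-off of the relaxed character class concrete, here is a small sketch. The sub name capture_pair and the malformed sample line are invented for illustration and are not part of the original program. It suggests that [\d.]+ accepts a bare 0, but would also accept a token such as 1.2.3, which is harmless as long as the input is known to be well formed.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the two values after "OK", or an empty list
# when the line does not match.
sub capture_pair {
    my ($line) = @_;
    return $line =~ /OK\s+([\d.]+)\s+([\d.]+)/;
}

my @ok  = capture_pair("con exp OK\t71.9792\t0\t-inf");
my @odd = capture_pair("con exp OK\t1.2.3\t0\t-inf");   # invented malformed input

print "@ok\n";    # 71.9792 0
print "@odd\n";   # 1.2.3 0
```

If stricter validation matters, a pattern that allows at most one decimal point per number is the safer choice.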

You have to make the \.\d+ optional by wrapping it in parentheses followed by a ?:

/OK\t(\d+(?:\.\d+)?)\t(\d+(?:\.\d+)?)/

The ?: after the open-paren prevents the regex engine from creating a capture group in the match result.
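As a rough illustration of what ?: changes, the sketch below runs the same pattern with and without it on one sample line (the variable names are made up for this example). With plain parentheses, the inner groups add extra entries to the list of captures, so the two values would no longer land in $1 and $2.

```perl
use strict;
use warnings;

my $line = "con exp OK\t71.9792\t0\t-inf";

# Non-capturing inner groups: only the two outer parentheses capture.
my @non_capturing = $line =~ /OK\t(\d+(?:\.\d+)?)\t(\d+(?:\.\d+)?)/;

# Plain inner parentheses: they capture too (undef when they took no part).
my @capturing = $line =~ /OK\t(\d+(\.\d+)?)\t(\d+(\.\d+)?)/;

print scalar(@non_capturing), "\n";   # 2
print scalar(@capturing), "\n";       # 4
```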

use Regexp::Common;
my ($con_val, $expt_val) = /OK\s+ ($RE{num}{real}) \s+ ($RE{num}{real})/x;

or

perl -anE 'say "@F[7,8]"' file
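For readers unfamiliar with the switches: -n loops over the input lines, -a splits each line on whitespace into @F, and -E enables say. A minimal script-form sketch of the same idea follows; the sub name fields_7_8 is hypothetical, and it assumes the two values are always the 8th and 9th whitespace-separated fields, as in the sample input.

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical helper mirroring -a: split on whitespace, return fields 7 and 8.
sub fields_7_8 {
    my ($line) = @_;
    my @F = split ' ', $line;
    return @F[7, 8];
}

my $sample = "XLOC_001511 XLOC_001511 - S7778:1319-1421 con exp OK 71.9792 0 -inf -nan 0.00035 0.0365407 yes";
say join ' ', fields_7_8($sample);   # 71.9792 0
```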

Assuming the second value (the one you want to capture as $expt_val) is always followed by a tab, this should work:

# [^\t]+ stops at the next tab; a greedy .+ would run on to the last tab in the line
my ($con_val, $expt_val) = $row =~ /OK\t(\d+\.\d+)\t([^\t]+)\t/;

You should use the or operator | to specify either:

  • One or more digits (\d+) followed by a literal . (\.) followed by one or more digits (\d+)

OR

  • A literal 0

Try this:

#!/usr/bin/perl
use warnings;
use strict; 
use Data::Dumper;

my @array =('XLOC_000559    XLOC_000559 -   S3603:13352-18211   con exp OK  1.39223 0.303787    -2.19627    -1.93877    0.0001  0.0140909   yes',
    'XLOC_001511    XLOC_001511 -   S7778:1319-1421 con exp OK  71.9792 0   -inf    -nan    0.00035 0.0365407   yes');

foreach (@array){
    # A list-context match returns the captures; skip lines that do not match
    my ($con_val, $expt_val) = /OK\t(\d+\.\d+|0)\t(\d+\.\d+|0)/ or next;
    print "$con_val\t$expt_val\n";
}

Outputs:

1.39223 0.303787
71.9792 0

Or better yet, assuming your values are separated by a \t, I'd go for this:

my (@con_val, @expt_val);
foreach (@array){
    my @split = split(/\t/);
    push @con_val, $split[7];
    push @expt_val, $split[8];
}

print Dumper \@expt_val;
print Dumper \@con_val;

Outputs:

$VAR1 = [
          '0.303787',
          '0'
        ];
$VAR1 = [
          '1.39223',
          '71.9792'
        ];

