Perl sort with regular expression
I have a Perl array of strings like this:
my @arr = ( "gene1 (100)", "gene2 (50)", "gene3 (120)", ... );
How can I sort the array by the integer in parentheses?
Use a Schwartzian transform (map, sort, map) to compare on the first number in each string:
use strict;
use warnings;
my @arr = ( "gene1 (100)", "gene2 (50)", "gene3 (120)");
my @sorted = map {$_->[0]}
sort {$a->[1] <=> $b->[1]}
map {[$_, /\b(\d+)\b/]} @arr;
print "$_\n" for @sorted;
Outputs:
gene2 (50)
gene1 (100)
gene3 (120)
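The map/sort/map chain above can be hard to read because it runs bottom to top. As a rough sketch, here is the same idea unpacked into three named steps; the intermediate variable names (@pairs, @ordered) are illustrative only and not part of the original answer.

```perl
use strict;
use warnings;

my @arr = ( "gene1 (100)", "gene2 (50)", "gene3 (120)" );

# Step 1: pair each string with the first standalone number found in it.
# In list context the match returns the captured digits.
my @pairs = map { [ $_, /\b(\d+)\b/ ] } @arr;

# Step 2: sort the pairs numerically on the extracted key.
my @ordered = sort { $a->[1] <=> $b->[1] } @pairs;

# Step 3: discard the keys and keep only the original strings.
my @sorted = map { $_->[0] } @ordered;

print "$_\n" for @sorted;
```

This prints the same three lines as the chained version. The regex runs once per element rather than once per comparison, which is the point of the transform.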
The sort built-in in Perl lets you pass a block or subroutine as its first argument to define how elements should be compared. Inside it, you can call any function you want.
Since you want to do it with a regular expression, it makes sense to create a sub that matches the number in the parentheses and use that in your sorting function.
You need to call it once each for $a and $b, the two variables that hold the pair of elements being compared at each step of the sort. You should use the <=> operator, which compares its operands numerically; with $a on the left, the result is in ascending order.
This is a very verbose version.
use strict;
use warnings;
use Data::Dump;
my @arr = ( "gene1 (100)", "gene2 (50)", "gene3 (120)", );
dd sort { get_number($a) <=> get_number($b) } @arr;
sub get_number {
my ( $string ) = @_;
return $1 if $string =~ m/\((\d+)\)/;
return 0; # no number found: treat it as 0, so it sorts first in ascending order
}
Output:
("gene2 (50)", "gene1 (100)", "gene3 (120)")
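To illustrate why <=> rather than cmp is the right operator here, the following small sketch sorts the same numbers both ways. The sample values are made up for the demonstration.

```perl
use strict;
use warnings;

my @nums = ( 100, 50, 120, 9 );

# cmp compares as strings, so "100" and "120" sort before "50" and "9".
my @as_strings = sort { $a cmp $b } @nums;

# <=> compares as numbers, giving numeric ascending order.
my @as_numbers = sort { $a <=> $b } @nums;

# Swapping $a and $b reverses the order (descending).
my @descending = sort { $b <=> $a } @nums;

print "cmp:  @as_strings\n";    # cmp:  100 120 50 9
print "<=>:  @as_numbers\n";    # <=>:  9 50 100 120
print "desc: @descending\n";    # desc: 120 100 50 9
```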
This shows the straightforward way. The sort block sets $aa and $bb to the numbers extracted from $a and $b respectively. Then <=> is used to compare them numerically.
There is no need for the much more obscure transformation method unless the basic technique proves to be too slow.
use strict;
use warnings;
use 5.010;
my @arr = ( "gene1 (100)", "gene2 (50)", "gene3 (120)", );
my @sorted = sort {
my ($aa) = $a =~ / \( (\d+) \) /x;
my ($bb) = $b =~ / \( (\d+) \) /x;
$aa <=> $bb;
} @arr;
say for @sorted;
Output:
gene2 (50)
gene1 (100)
gene3 (120)
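If some strings might lack a number in parentheses, the block above would compare undef values and emit warnings. One possible defensive variant, assuming such entries should be treated as 0, is sketched here; "gene4" is a hypothetical entry added to show the fallback.

```perl
use strict;
use warnings;
use 5.010;

# "gene4" is a made-up entry with no number in parentheses.
my @arr = ( "gene1 (100)", "gene4", "gene2 (50)" );

my @sorted = sort {
    my ($aa) = $a =~ / \( (\d+) \) /x;
    my ($bb) = $b =~ / \( (\d+) \) /x;
    # A failed match leaves the variable undef; fall back to 0.
    ( $aa // 0 ) <=> ( $bb // 0 );
} @arr;

say for @sorted;
```

With a fallback of 0, the entry without a number sorts first. Choosing a very large default instead would push it to the end.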
The List::UtilsBy CPAN module provides a function, nsort_by, which sorts a list of values into numerical order of the values returned by a block of code called on each value.
In your case, it can be used to extract that number:
use List::UtilsBy 'nsort_by';
my @sorted = nsort_by { m/\((\d+)/ and $1 } @arr;
This is somewhat more efficient than a regular sort call with code to extract and compare the two numbers from $a and $b directly, as it only has to extract the number from each value once, rather than once for every pair-wise comparison.
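The difference between extracting once per value and once per comparison can be made visible by counting calls. The sketch below uses only core Perl and does not depend on List::UtilsBy; extract_number and the counter are hypothetical helpers written for this illustration, and the precomputed-key hash is only a stand-in for the general idea of computing each key once.

```perl
use strict;
use warnings;

my @arr = ( "gene1 (100)", "gene2 (50)", "gene3 (120)", "gene4 (7)" );

my $calls = 0;

# Hypothetical helper: extracts the number and counts how often it runs.
sub extract_number {
    my ($string) = @_;
    $calls++;
    return $string =~ /\((\d+)\)/ ? $1 : 0;
}

# Extracting inside the comparison: two calls for every comparison made.
$calls = 0;
my @plain = sort { extract_number($a) <=> extract_number($b) } @arr;
my $plain_calls = $calls;

# Extracting up front: exactly one call per element.
$calls = 0;
my %key = map { $_ => extract_number($_) } @arr;
my @cached = sort { $key{$a} <=> $key{$b} } @arr;
my $cached_calls = $calls;

print "in-comparison extraction: $plain_calls calls\n";
print "precomputed keys:         $cached_calls calls\n";
```

Both approaches produce the same order. The precomputed version always makes one call per element, while the in-comparison count grows with the number of comparisons the sort performs, so the gap widens on longer lists.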