Perl - assign output of split() to hash slice - detecting length mismatch

I am converting a record of tab-separated values into a hash, as follows:

my @field_names = qw(foo bar xyzzy);
my $record = "33\t45\t78\n";
my %feqv_hash;
@feqv_hash{@field_names} = split /\t/, $record;

which creates %feqv_hash:

{ foo => 33, bar => 45, xyzzy => 78 }

I'd like to be able to ensure, as quickly as possible, that $record has the same number of values as @field_names.

This is the best I can come up with:

my @field_names = qw(foo bar xyzzy);
my $record = "33\t45\t78\n";
my @field_values = split /\t/, $record;
croak if @field_names != @field_values;
my %feqv_hash;
@feqv_hash{@field_names} = @field_values;

Is there a way that might execute any faster (e.g., one that does not require the temporary array @field_values)?

First of all, you want to pass -1 as split's third argument (omitting it is the same as passing 0) so that fields which are present but empty at the end of the record are not removed.

my @field_names = qw(foo bar xyzzy);
my $record = "33\t45\t78\n";
my %feqv_hash;
@feqv_hash{@field_names} = split /\t/, $record, -1;
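To illustrate why the limit matters, here is a minimal sketch using a hypothetical record (not from the question) whose last two fields are empty. With the default limit the trailing empty fields disappear, so a length check would report a mismatch even though the record is well-formed.

```perl
use strict;
use warnings;

# Hypothetical record: three fields, the last two of them empty.
my $record = "33\t\t";

my @default = split /\t/, $record;       # default limit: trailing empty fields are dropped
my @keep    = split /\t/, $record, -1;   # negative limit: trailing empty fields are kept

printf "default limit: %d field(s)\n", scalar @default;
printf "limit -1:      %d field(s)\n", scalar @keep;
```

The first count comes out as 1 and the second as 3, which is why the -1 belongs in the split call before any field-count check is added.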

Let's see how slow the check is.

use strict;
use warnings;
use Benchmark qw( timethese );
use Carp      qw( croak );

my %tests = (
   without => <<'__EOI__',
      my %feqv_hash;
      @feqv_hash{@field_names} = split /\t/, $record, -1;
__EOI__
   with => <<'__EOI__',
      my @field_values = split /\t/, $record, -1;
      croak if @field_names != @field_values;
      my %feqv_hash;
      @feqv_hash{@field_names} = @field_values;
__EOI__
);    

$_ = 'use strict; use warnings; our @field_names; our $record; '.$_
   for values %tests;

{
   local our @field_names = qw(foo bar xyzzy);
   local our $record = "33\t45\t78\n";
   timethese(-3, \%tests);
}

Results:

without check: 2.7 microseconds
with check:    4.1 microseconds
               ----------------
check:         1.4 microseconds

The check takes 1.4 microseconds. I'm not sure why you think there's a problem.


But it is possible to cut that time almost in half by scanning the string with tr/\t//. [Update: or by using a list assignment in scalar context.]

use strict;
use warnings;
use Benchmark qw( cmpthese );
use Carp      qw( croak );

my %tests = (
   temp_array => <<'__EOI__',
      my @field_values = split /\t/, $record, -1;
      croak if @field_names != @field_values;
      my %feqv_hash;
      @feqv_hash{@field_names} = @field_values;
__EOI__
   tr => <<'__EOI__',
      croak if @field_names != 1 + $record =~ tr/\t//;
      my %feqv_hash;
      @feqv_hash{@field_names} = split /\t/, $record, -1;
__EOI__
   aassign => <<'__EOI__',
      my %feqv_hash;
      ( @feqv_hash{@field_names} = split /\t/, $record, -1 ) == @field_names
         or croak;
__EOI__
);    

$_ = 'use strict; use warnings; our @field_names; our $record; '.$_
   for values %tests;

{
   local our @field_names = qw(foo bar xyzzy);
   local our $record = "33\t45\t78\n";
   cmpthese(-3, \%tests);
}

Results:

               Rate temp_array         tr    aassign
temp_array 233472/s         --       -30%       -36%
tr         334671/s        43%         --        -8%
aassign    362326/s        55%         8%         --
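The tr variant relies on the fact that a record with N tabs has N + 1 fields. The following is a small sketch of that counting step on its own, using a sample record without the trailing newline. It also shows one edge case where the two counts disagree, which may or may not matter for your data.

```perl
use strict;
use warnings;

my $record = "33\t45\t78";

# With an empty replacement list and no /d modifier, tr/// leaves the
# string unchanged and returns the number of matching characters.
my $tabs   = ( $record =~ tr/\t// );
my $fields = 1 + $tabs;

my @values = split /\t/, $record, -1;

print "tabs: $tabs, fields by tr: $fields, fields by split: ",
      scalar(@values), "\n";

# Edge case: an empty record counts as one field by the tab count,
# but split returns an empty list for an empty string, even with -1.
my $empty    = "";
my $by_tr    = 1 + ( $empty =~ tr/\t// );
my @by_split = split /\t/, $empty, -1;

print "empty record: tr says $by_tr, split returns ",
      scalar(@by_split), "\n";
```

If empty records can occur, the aassign variant sidesteps this difference because it counts what split actually returned.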

This is certainly premature optimization; write the code in whatever way is most readable to its likely audience, not for some practically unmeasurable performance boost.

That said, in scalar context the slice assignment itself (like all list assignments) returns the count of elements on the right, so all you need to do is:

( @feqv_hash{@field_names} = split /\t/, $record, -1 ) == @field_names
    or croak;
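If readability is the main concern, the same idea can be wrapped in a small helper that reports what went wrong. This is only a sketch: the name record_to_hash, the error message, and the chomp (which assumes the trailing newline is not meant to be part of the last value) are illustrative choices, not something from the question.

```perl
use strict;
use warnings;
use Carp qw( croak );

# Hypothetical helper: builds a hash from a tab-separated record and
# croaks with a descriptive message if the field count is wrong.
sub record_to_hash {
    my ($field_names, $record) = @_;

    chomp $record;   # assumption: the newline is not part of the last value

    my %hash;
    # A list assignment in scalar context evaluates to the number of
    # elements on its right-hand side, i.e. how many fields split returned.
    my $got = ( @hash{@$field_names} = split /\t/, $record, -1 );

    $got == @$field_names
        or croak sprintf "expected %d fields, got %d",
                         scalar @$field_names, $got;

    return \%hash;
}

my $h = record_to_hash( [qw(foo bar xyzzy)], "33\t45\t78\n" );
print join( ", ", map { "$_ => $h->{$_}" } sort keys %$h ), "\n";

# A record with too few fields is rejected.
my $ok = eval { record_to_hash( [qw(foo bar xyzzy)], "33\t45\n" ); 1 };
print "short record rejected\n" unless $ok;
```

Storing the count in $got costs one scalar rather than a temporary array, and it lets the error message say how many fields were actually found.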
