How can I create unique identifiers in Perl?

I am creating a file-oriented database of test results produced by various users. For this I need to generate a unique id for every entry in the database. The ids must satisfy the following requirements:

  • Ids should be fairly small (6 characters at most)
  • For every test case and user combination, the same id should be generated each time

What I tried was a simple BKDR hash function with a seed value of 31, using the ord() function as follows:

@chars = split(//,$hash_var);

$hash = 0;
$seed = 31;

foreach $char ( @chars ) {
   if( $char !~ m/\d/ ) {
       $hash = ( $seed * $hash ) + ord( $char );
   }  
   else {
       $hash = ( $seed * $hash ) + $char ;
   }
}

$hash = ( $hash & 0x7FFFFFFF ) % 1000;
$hash = "$chars[0]$chars[$#chars]$hash" ;

This sometimes produces the same result for different combinations, i.e. the ids are not unique. Is there any other way to accomplish this? Does changing the seed value help achieve uniqueness?

Do you have more than 256 users and/or more than 65536 test cases per user? If not, you can just index users from 0 .. 255 and test cases from 0 .. 65535 and encode the pair as a string of hexadecimal digits, so six characters would be enough.
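A minimal sketch of that encoding, assuming the indexes have already been assigned. The helper names encode_id and decode_id are made up for illustration; two hex digits hold the user index and four hold the test case index:

```perl
use strict;
use warnings;

# Hypothetical helper: encode a user index (0..255) and a test case index
# (0..65535) as a fixed-width, six-character hexadecimal id.
sub encode_id {
    my ($user_idx, $case_idx) = @_;
    die "user index out of range\n" if $user_idx < 0 || $user_idx > 0xFF;
    die "case index out of range\n" if $case_idx < 0 || $case_idx > 0xFFFF;
    return sprintf('%02x%04x', $user_idx, $case_idx);
}

# Inverse: split a six-character id back into its two indexes.
sub decode_id {
    my ($id) = @_;
    return (hex(substr($id, 0, 2)), hex(substr($id, 2, 4)));
}

print encode_id(3, 258), "\n";    # prints 030102
```

Because the width is fixed, the id can be split back into its two indexes without a separator.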

If you have more users or test cases than that, I would again index the users and test cases and then combine them into a 32-bit integer, which takes only 4 bytes and is trivial to implement but slightly harder on humans.
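As a rough sketch of the 32-bit variant: the split below (12 bits for the user, 20 bits for the test case) is only an assumption, as are the helper names pack_id and unpack_id. Pick whatever split matches your data:

```perl
use strict;
use warnings;

# Assumed split of the 32 bits: 12 for the user index, 20 for the test case.
use constant CASE_BITS => 20;
use constant USER_BITS => 32 - CASE_BITS;

# Hypothetical helper: combine both indexes into 4 raw bytes (big-endian).
sub pack_id {
    my ($user_idx, $case_idx) = @_;
    die "user index out of range\n" if $user_idx < 0 || $user_idx >= (1 << USER_BITS);
    die "case index out of range\n" if $case_idx < 0 || $case_idx >= (1 << CASE_BITS);
    return pack('N', ($user_idx << CASE_BITS) | $case_idx);
}

# Inverse: recover the two indexes from the 4 bytes.
sub unpack_id {
    my ($bytes) = @_;
    my $n = unpack('N', $bytes);
    return ($n >> CASE_BITS, $n & ((1 << CASE_BITS) - 1));
}

my $id = pack_id(5, 70000);
print length($id), " bytes\n";                # prints "4 bytes"
print join(', ', unpack_id($id)), "\n";       # prints "5, 70000"
```

The 4 bytes are generally not printable, which is the "harder on humans" part. Printing them as hex would take 8 characters and break the six-character limit.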

In any case, I am assuming you are given the user name and test case information. Just keep two tied hashes, %users and %cases, to map users and test cases to their index numbers.
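One way this could look, as a sketch only: the choice of SDBM_File as the backing store, the temporary directory, and the helper name index_for are all assumptions. Any tie-able DBM module would do, and a real database would use a fixed path so the mapping survives between runs:

```perl
use strict;
use warnings;
use Fcntl;                       # O_RDWR, O_CREAT
use SDBM_File;
use File::Temp qw(tempdir);

# Assumption: a throwaway directory for the demo; use a permanent path in practice.
my $dir = tempdir(CLEANUP => 1);

tie my %users, 'SDBM_File', "$dir/users", O_RDWR | O_CREAT, 0666
    or die "Cannot tie users: $!";
tie my %cases, 'SDBM_File', "$dir/cases", O_RDWR | O_CREAT, 0666
    or die "Cannot tie cases: $!";

# Hypothetical helper: return the stored index for a name, assigning the
# next free index (the current number of keys) the first time it is seen.
sub index_for {
    my ($map, $name) = @_;
    $map->{$name} = scalar(keys %$map) unless defined $map->{$name};
    return $map->{$name};
}

print index_for(\%users, 'alice'), "\n";   # prints 0
print index_for(\%users, 'bob'),   "\n";   # prints 1
print index_for(\%users, 'alice'), "\n";   # prints 0 again
```

Counting the keys on every new name is fine for small tables; for large ones, storing the next free index under a reserved key would avoid the scan.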

Part of your problem may be that you are using floating point math, and BKDR almost certainly wants integer math. You can fix that bug by writing:

my @chars = split(//,$hash_var);

my $hash = 0;
my $seed = 31;

for my $char ( @chars ) {
   use integer;
   if( $char !~ m/\d/ ) {
       $hash = ( $seed * $hash ) + ord( $char );
   }  
   else {
       $hash = ( $seed * $hash ) + $char ;
   }
}

$hash = ( $hash & 0x7FFFFFFF ) % 1000;
$hash = "$chars[0]$chars[$#chars]$hash" ;

Another tweak that might help is using characters other than the first and last. If the first and last characters tend to be the same, they add no uniqueness to the hash.

You may also want to use a better hash function like MD5 (available in Digest::MD5) and trim the result to your desired size. However, the fact that you are using a hash at all means that you run the risk of a collision.
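A minimal sketch of the MD5 approach, assuming the user and test case names are plain byte strings (md5_hex croaks on wide characters). The helper name short_id and the "|" separator are illustrative choices:

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical helper: derive a six-character id from the user and test names.
sub short_id {
    my ($user, $test) = @_;
    # Join with a separator so ("ab", "c") and ("a", "bc") hash differently.
    return substr(md5_hex("$user|$test"), 0, 6);
}

print short_id('user1', 'test1'), "\n";
```

Six hex characters carry only 24 bits, so by the birthday bound you should expect collisions once you have a few thousand entries. The same input always yields the same id, but distinct inputs are not guaranteed distinct ids.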

If you don't have a lot of users/testcases, a simple solution like this might be enough. You'd have to add the limit (and probably pack the integer when storing it).

vinko@parrot:~# more hash.pl
use strict;
use warnings;

my %hash;
my $count = 0;

sub getUniqueId {

        my $_user = shift;
        my $_test = shift;
        my $val;

        my $key = $_user."|".$_test;
        if (defined $hash{$key}) {
                $val = $hash{$key};
        } else {
                $hash{$key} = $count;
                $val = $count;
                $count = $count + 1;
        }
        return $val;
}

my @users = qw{ user1 user2 user3 user4 user5 user3 user5 };
my @testcases = qw{ test1 test2 test3 test1 test1 };

for my $user (@users) {
        for my $test (@testcases) {
                print "$user $test: ".getUniqueId($user,$test)."\n";
        }
}
vinko@parrot:~# perl hash.pl
user1 test1: 0
user1 test2: 1
user1 test3: 2
user1 test1: 0
user1 test1: 0
user2 test1: 3
user2 test2: 4
user2 test3: 5
user2 test1: 3
user2 test1: 3
user3 test1: 6
user3 test2: 7
user3 test3: 8
user3 test1: 6
user3 test1: 6
user4 test1: 9
user4 test2: 10
user4 test3: 11
user4 test1: 9
user4 test1: 9
user5 test1: 12
user5 test2: 13
user5 test3: 14
user5 test1: 12
user5 test1: 12
user3 test1: 6
user3 test2: 7
user3 test3: 8
user3 test1: 6
user3 test1: 6
user5 test1: 12
user5 test2: 13
user5 test3: 14
user5 test1: 12
user5 test1: 12
