[英]How can I merge several hashes into one hash in Perl?
In Perl, how do I get this: 在Perl中,我该如何获得:
$VAR1 = { '999' => { '998' => [ '908', '906', '0', '998', '907' ] } };
$VAR1 = { '999' => { '991' => [ '913', '920', '918', '998', '916', '919', '917', '915', '912', '914' ] } };
$VAR1 = { '999' => { '996' => [] } };
$VAR1 = { '999' => { '995' => [] } };
$VAR1 = { '999' => { '994' => [] } };
$VAR1 = { '999' => { '993' => [] } };
$VAR1 = { '999' => { '997' => [ '986', '987', '990', '984', '989', '988' ] } };
$VAR1 = { '995' => { '101' => [] } };
$VAR1 = { '995' => { '102' => [] } };
$VAR1 = { '995' => { '103' => [] } };
$VAR1 = { '995' => { '104' => [] } };
$VAR1 = { '995' => { '105' => [] } };
$VAR1 = { '995' => { '106' => [] } };
$VAR1 = { '995' => { '107' => [] } };
$VAR1 = { '994' => { '910' => [] } };
$VAR1 = { '993' => { '909' => [] } };
$VAR1 = { '993' => { '904' => [] } };
$VAR1 = { '994' => { '985' => [] } };
$VAR1 = { '994' => { '983' => [] } };
$VAR1 = { '993' => { '902' => [] } };
$VAR1 = { '999' => { '992' => [ '905' ] } };
to this: 对此:
$VAR1 = { '999:' => [
{ '992' => [ '905' ] },
{ '993' => [
{ '909' => [] },
{ '904' => [] },
{ '902' => [] }
] },
{ '994' => [
{ '910' => [] },
{ '985' => [] },
{ '983' => [] }
] },
{ '995' => [
{ '101' => [] },
{ '102' => [] },
{ '103' => [] },
{ '104' => [] },
{ '105' => [] },
{ '106' => [] },
{ '107' => [] }
] },
{ '996' => [] },
{ '997' => [ '986', '987', '990', '984', '989', '988' ] },
{ '998' => [ '908', '906', '0', '998', '907' ] },
{ '991' => [ '913', '920', '918', '998', '916', '919', '917', '915', '912', '914' ] }
]};
I think this is closer than anybody else has gotten: 我认为这比其他任何人都更接近:
This does most of what you want. 这可以满足您的大部分需求。 I did not store things in arrays of singular hashes, as I don't feel that that is useful.
我没有将东西存储在单个哈希的数组中,因为我觉得这不是很有用。
Your scenario is not a regular one. 您的方案不是常规方案。 I've tried to genericize this to some extent, but was not possible to overcome the singularity of this code.
我试图在某种程度上对此进行泛化,但是无法克服这段代码的奇异性。
First of all because it appears you want to collapse everything with the same id into a merged entity (with exceptions), you have to descend through the structure pulling the definitions of the entities. 首先,因为看起来你想要将具有相同id的所有内容合并到一个合并的实体中(有例外),你必须通过结构来下拉实体的定义。 Keeping track of levels, because you want them in the form of a tree.
跟踪关卡,因为你想要它们以树的形式。
Next, you assemble the ID table, merging entities as possible. 接下来,组装ID表,尽可能合并实体。 Note that you had 995 defined as an empty array one place and as a level another.
请注意,您将995定义为一个空数组,另一个定义为另一个数组。 So given your output, I wanted to overwrite the empty list with the hash.
所以考虑到你的输出,我想用哈希覆盖空列表。
After that, we need to move the root to the result structure, descending that in order to assign canonical entities to the identifiers at each level. 之后,我们需要将根移动到结果结构,然后降序,以便将规范实体分配给每个级别的标识符。
Like I said, it's not anything that regular. 就像我说的那样,这不是常规的。 Of course, if you still want a list of hashes which are no more than pairs, that's an exercise left to you.
当然,如果你仍然想要一个不超过对的哈希列表,那么这是一个留给你的练习。
use strict;
use warnings;
# subroutine to identify all elements
sub descend_identify {
my ( $level, $hash_ref ) = @_;
# return an expanding list that gets populated as we desecend
return map {
my $item = $hash_ref->{$_};
$_ => ( $level, $item )
, ( ref( $item ) eq 'HASH' ? descend_identify( $level + 1, $item )
: ()
)
;
} keys %$hash_ref
;
}
# subroutine to refit all nested elements
sub descend_restore {
my ( $hash, $ident_hash ) = @_;
my @keys = keys %$hash;
@$hash{ @keys } = @$ident_hash{ @keys };
foreach my $h ( grep { ref() eq 'HASH' } values %$hash ) {
descend_restore( $h, $ident_hash );
}
return;
}
# merge hashes, descending down the hash structures.
sub merge_hashes {
my ( $dest_hash, $src_hash ) = @_;
foreach my $key ( keys %$src_hash ) {
if ( exists $dest_hash->{$key} ) {
my $ref = $dest_hash->{$key};
my $typ = ref( $ref );
if ( $typ eq 'HASH' ) {
merge_hashes( $ref, $src_hash->{$key} );
}
else {
push @$ref, $src_hash->{$key};
}
}
else {
$dest_hash->{$key} = $src_hash->{$key};
}
}
return;
}
my ( %levels, %ident_map, %result );
#descend through every level of hash in the list
# @hash_list is assumed to be whatever you Dumper-ed.
my @pairs = map { descend_identify( 0, $_ ); } @hash_list;
while ( @pairs ) {
my ( $key, $level, $ref ) = splice( @pairs, 0, 3 );
$levels{$key} |= $level;
# if we already have an identity for this key, merge the two
if ( exists $ident_map{$key} ) {
my $oref = $ident_map{$key};
my $otyp = ref( $oref );
if ( $otyp ne ref( $ref )) {
# empty arrays can be overwritten by hashrefs -- per 995
if ( $otyp eq 'ARRAY' && @$oref == 0 && ref( $ref ) eq 'HASH' ) {
$ident_map{$key} = $ref;
}
else {
die "Uncertain merge for '$key'!";
}
}
elsif ( $otyp eq 'HASH' ) {
merge_hashes( $oref, $ref );
}
else {
@$oref = sort { $a <=> $b || $a cmp $b } keys %{{ @$ref, @$oref }};
}
}
else {
$ident_map{$key} = $ref;
}
}
# Copy only the keys that do not appear at higher levels to the
# result hash
if ( my @keys = grep { !$levels{$_} } keys %ident_map ) {
@result{ @keys } = @ident_map{ @keys } if @keys;
}
# then step through the hash to make sure that the entries at
# all levels are equal to the identity
descend_restore( \%result, \%ident_map );
Use CPAN! 使用CPAN! Try Hash::Merge
试试Hash :: Merge
# OO interface.
my $merge = Hash::Merge->new( 'LEFT_PRECEDENT' );
my %c = %{ $merge->merge( \%a, \%b ) };
See CPAN for more info, it pretty much does everything you would want to, and is fully customizable. 有关详细信息,请参阅CPAN,它几乎可以完成您想要的所有操作,并且可以完全自定义。
Give this recursive solution a try: 试试这个递归解决方案:
# XXX: doesn't handle circular problems...
sub deepmerge {
my (@structs) = @_;
my $new;
# filter out non-existant structs
@structs = grep {defined($_)} @structs;
my $ref = ref($structs[0]);
if (not all(map {ref($_) eq $ref} @structs)) {
warn("deepmerge: all structs are not $ref\n");
}
my @tomerge = grep {ref($_) eq $ref} @structs;
return qr/$tomerge[0]/ if scalar(@tomerge) == 1 and $ref eq 'Regexp';
return $tomerge[0] if scalar(@tomerge) == 1;
if ($ref eq '') {
$new = pop(@tomerge); # prefer farthest right
}
elsif ($ref eq 'Regexp') {
$new = qr/$tomerge[$#tomerge]/;
}
elsif ($ref eq 'ARRAY') {
$new = [];
for my $i (0 .. max(map {scalar(@$_) - 1} @tomerge)) {
$new->[$i] = deepmerge(map {$_->[$i]} @tomerge);
}
}
elsif ($ref eq 'HASH') {
$new = {};
for my $key (uniq(map {keys %$_} @tomerge)) {
$new->{$key} = deepmerge(map {$_->{$key}} @tomerge);
}
}
else {
# ignore all other structures...
$new = '';
}
return $new;
}
Modify it to your hearts content to achieve the desired result. 将其修改为您的心灵内容,以达到理想的效果。
Upon further investigation, I noticed you're merging them in some different way than the above algorithm. 经过进一步调查,我注意到你正在以与上述算法不同的方式合并它们。 Maybe just use this as an example then.
也许只是以此为例。 Mine does this:
我这样做:
deepmerge({k => 'v'}, {k2 => 'v2'});
# returns {k => 'v', k2 => 'v2'}
And similar things for arrays. 和数组类似的东西。
I indented your wanted output as it was hard to read, for the benefit of other people who want to answer. 为了其他想要回答的人的利益,我缩小了您想要的输出,因为它很难阅读。 I'm still thinking of an answer.
我还在想一个答案。
$VAR1 = { '999:' => [
{ '992' => [ '905' ] },
{ '993' => [
{ '909' => [] },
{ '904' => [] },
{ '902' => [] }
]
},
{ '994' => [
{ '910' => [] },
{ '985' => [] },
{ '983' => [] }
]
},
{ '995' => [
{ '101' => [] },
{ '102' => [] },
{ '103' => [] },
{ '104' => [] },
{ '105' => [] },
{ '106' => [] },
{ '107' => [] }
]
},
{ '996' => [] },
{ '997' => [ '986', '987', '990', '984', '989', '988' ] },
{ '998' => [ '908', '906', '0', '998', '907' ] },
{ '991' => [ '913', '920', '918', '998', '916', '919', '917', '915', '912', '914' ] }
]
};
I don't see the point of all those single entry hashes though, would not the following be better? 我没有看到所有那些单一入口哈希的重点,以下不会更好吗?
$VAR1 = { '999:' => {
'992' => [ '905' ],
'993' => {
'909' => [],
'904' => [],
'902' => []
},
'994' => {
'910' => [],
'985' => [],
'983' => []
},
'995' => {
'101' => [],
'102' => [],
'103' => [],
'104' => [],
'105' => [],
'106' => [],
'107' => []
},
'996' => [],
'997' => [ '986', '987', '990', '984', '989', '988' ],
'998' => [ '908', '906', '0', '998', '907' ],
'991' => [ '913', '920', '918', '998', '916', '919', '917', '915', '912', '914' ]
}
};
Assuming the above data is in a file dump.txt, you can eval it piece by piece. 假设上面的数据在文件dump.txt中,您可以逐个评估它。
Updated code below 更新了以下代码
use strict;
use File::Slurp;
my $final_data = {};
my @data = map {eval $_} (read_file("dump.txt") =~ /\$VAR1 = ([^;]+);/gs);
foreach my $element (@data) {
my $key = (keys %$element)[0];
$final_data->{$key} ||= [];
push @{$final_data->{$key}}, $element->{$key}
};
use Data::Dumper;
print Data::Dumper->Dump([$final_data]);
If you want to completely deep merge, you can at the end pass $final_data through this (not tested!!!) deep merger: 如果你想完全深度合并,你可以在最后通过这个(未测试的!!!)深度合并传递$ final_data:
# Merge an array of hashes as follows:
# IN: [ { 1 => 11 }, { 1 => 12 },{ 2 => 22 } ]
# OUT: { 1 => [ 11, 12 ], 2 => [ 22 ] }
# This is recursive - if array [11,12] was an array of hashrefs, we merge those too
sub merge_hashes {
my $hashes = @_[0];
return $hashes unless ref $hashes eq ref []; # Hat tip to brian d foy
return $hashes unless grep { ref @_ eq ref {} } @$hashes; # Only merge array of hashes
my $final_hashref = {};
foreach my $element (@$hashes) {
foreach my $key (keys %$element) {
$final_hashref->{$key} ||= [];
push @{ $final_hashref->{$key} }, $element->{$key};
}
}
foreach my $key (keys %$final_hashref) {
$final_hashref->{$key} = merge_hashes($final_hashref->{$key});
}
return $final_hashref;
}
Use push
and autovivification. 使用
push
和自动修复。
Start with the usual front matter: 从通常的前面问题开始:
#! /usr/bin/perl
use warnings;
use strict;
Read your sample input from the DATA
filehandle and create a datastructure similar to the one you dumped: 从
DATA
文件句柄中读取样本输入并创建与您转储的DATA
结构类似的DATA
结构:
my @hashes;
while (<DATA>) {
my $VAR1;
$VAR1 = eval $_;
die $@ if $@;
push @hashes => $VAR1;
}
Your input has two cases: 您的输入有两种情况:
Note the use of $_[0]
. 注意使用
$_[0]
。 The semantics of Perl subroutines are such that the values in @_
are aliases rather than copies. Perl子例程的语义使得
@_
中的值是别名而不是副本。 This lets us call merge
directly without having to first create a bunch of scaffolding to hold the merged contents. 这让我们可以直接调用
merge
而无需先创建一堆脚手架来保存合并的内容。 The code will break if you copy the value instead. 如果您复制该值,代码将会中断。
sub merge {
my $data = shift;
if (ref($data) eq "ARRAY") {
push @{ $_[0] } => @$data;
}
else {
foreach my $k (%$data) {
merge($data->{$k} => $_[0]{$k});
}
}
}
Now we walk @hashes
and incrementally merge their contents into %merged
. 现在我们走
@hashes
并逐步将其内容%merged
到%merged
。
my %merged;
foreach my $h (@hashes) {
foreach my $k (keys %$h) {
merge $h->{$k} => $merged{$k};
}
}
We don't know in what order the values arrived, so run a final cleanup pass to sort the arrays: 我们不知道值到达的顺序是什么,因此运行最终的清理过程来对数组进行排序:
sub sort_arrays {
my($root) = @_;
if (ref($root) eq "ARRAY") {
@$root = sort { $a <=> $b } @$root;
}
else {
sort_arrays($root->{$_}) for keys %$root;
}
}
sort_arrays \%merged;
The Data::Dumper module is great for quick debugging! Data :: Dumper模块非常适合快速调试!
use Data::Dumper;
$Data::Dumper::Indent = 1;
print Dumper \%merged;
Place a copy of the input from your question into the special DATA
filehandle: 将问题输入的副本放入特殊的
DATA
文件句柄中:
__DATA__
$VAR1 = { '999' => { '998' => [ '908', '906', '0', '998', '907' ] } };
$VAR1 = { '999' => { '991' => [ '913', '920', '918', '998', '916', '919', '917', '915', '912', '914' ] } };
$VAR1 = { '999' => { '996' => [] } };
$VAR1 = { '999' => { '995' => [] } };
$VAR1 = { '999' => { '994' => [] } };
$VAR1 = { '999' => { '993' => [] } };
$VAR1 = { '999' => { '997' => [ '986', '987', '990', '984', '989', '988' ] } };
$VAR1 = { '995' => { '101' => [] } };
$VAR1 = { '995' => { '102' => [] } };
$VAR1 = { '995' => { '103' => [] } };
$VAR1 = { '995' => { '104' => [] } };
$VAR1 = { '995' => { '105' => [] } };
$VAR1 = { '995' => { '106' => [] } };
$VAR1 = { '995' => { '107' => [] } };
$VAR1 = { '994' => { '910' => [] } };
$VAR1 = { '993' => { '909' => [] } };
$VAR1 = { '993' => { '904' => [] } };
$VAR1 = { '994' => { '985' => [] } };
$VAR1 = { '994' => { '983' => [] } };
$VAR1 = { '993' => { '902' => [] } };
$VAR1 = { '999' => { '992' => [ '905' ] } };
A sample of the output is below: 输出样本如下:
'994' => { '910' => [], '985' => [], '983' => [] }, '999' => { '993' => [], '992' => [ '905' ], '997' => [ '984', '986', '987', '988', '989', '990' ],
wow. 哇。 thanks so much everyone (especially Axeman)!
非常感谢大家(特别是Axeman)! sorry for the lack of code or clarification, I was trying to generate a tree, and did try Hash::Merge, but could not for the life of me resolve the coined-995 problem of replacing the empty 995 with the non-empty 995;
抱歉没有代码或澄清,我试图生成一个树,并尝试Hash :: Merge,但不能为我的生活解决用非空995替换空995的创造995问题; Axeman's solution works beautifully and I really appreciate the help/collaboration!
Axeman的解决方案非常有效,我非常感谢帮助/合作! (also tried the others and it either did the same thing as Hash::Merge, or it actually got rid of some branches).
(也尝试了其他人,它或者做了与Hash :: Merge相同的事情,或者它实际上摆脱了一些分支)。
some background on the input: had a set of hashes, each had keys (all same level) and two of which defined a) a parent to another, and b) itself (the rest were children), and so with a tree, i figured a hash was perfect, came up with a set of new hashes {a}->{b}->[c], and here we are... 关于输入的一些背景:有一组哈希,每个都有键(所有相同的级别),其中两个定义了a)父级到另一个,和b)本身(其余的是孩子),所以用树,我想到哈希是完美的,想出了一组新的哈希{a} - > {b} - > [c],我们在这里......
again, thanks everyone and Axeman! 再次感谢大家和Axeman!
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.