Clone a Binary Tree with Random Pointers

Can anyone explain how to clone a binary tree whose nodes carry a random pointer in addition to left and right? Every node has the following structure.

struct node {  
    int key; 
    struct node *left,*right,*random;
};

This is a very popular interview question, and I was able to figure out a solution based on hashing (similar to cloning a linked list with random pointers). I tried to understand the solution given in Link (approach 2), but even after reading the code I cannot figure out what it is trying to convey. I am not looking for a hashing-based solution, since that one is intuitive and pretty straightforward. Please explain the solution based on modifying the binary tree and cloning it.

The solution presented is based on the idea of interleaving both trees, the original one and its clone.

For every node A in the original tree, its clone cA is created and inserted as A's left child. The original left child of A is shifted one level down in the tree structure and becomes the left child of cA.

For each node B that is the right child of its parent P (i.e., B == P->right), a pointer to its clone cB is also stored in the clone of its parent, so that cP->right == cB.

       P                     P
      / \                   / \
     /   \                 /   \
    A     B               cP    B
   /       \             / \   / \
  /         \           /   \ /   \
 X           Z         A    cB     Z
                      /       \   /
                     cA        cZ
                    /
                   X
                  /
                 cX

Finally, we can extract the cloned tree by traversing the interleaved tree and unlinking every other node on each 'left' path (starting from root->left) together with its 'rightmost' descendants path and, recursively, every other 'left' descendant of those, and so on.

Importantly, each cloned node is a direct left child of its original node. So in the middle part of the algorithm, after inserting the cloned nodes but before extracting them, we can traverse the whole tree walking on original nodes, and whenever we find a random pointer, say A->random == Z, we can copy the binding into the clones by setting cA->random = cZ, which resolves to something like

A->left->random = A->random->left;

This allows cloning random pointers directly and does not require additional hash maps (at the cost of interleaving new nodes into the original tree and extracting them later).
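The three phases described above (interleave, copy the random pointers, extract) can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not the code from the linked article: the `Node` class and the function names `interleave`, `copy_randoms`, `separate` and `clone_tree` are made up for this example, and the recursion assumes the tree is shallow enough for Python's default recursion limit.

```python
class Node:
    """Hypothetical Python counterpart of the C struct in the question."""
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.random = None


def interleave(node):
    """Insert a clone as the left child of every original node; return node's clone."""
    if node is None:
        return None
    clone = Node(node.key)
    clone.left = node.left                 # the original left child moves below the clone
    node.left = clone
    interleave(clone.left)                 # continue with the original left subtree
    clone.right = interleave(node.right)   # cP->right == cB, as in the diagram
    return clone


def copy_randoms(node):
    """Walk the original nodes only; the clone of any original X is X.left."""
    if node is None:
        return
    clone = node.left
    clone.random = node.random.left if node.random else None
    copy_randoms(clone.left)               # original left child
    copy_randoms(node.right)               # original right child


def separate(node):
    """Undo the interleaving, restoring the original tree; return node's clone."""
    if node is None:
        return None
    clone = node.left
    original_left = clone.left
    node.left = original_left
    clone.left = separate(original_left)
    separate(node.right)                   # clone.right already points at the right clone
    return clone


def clone_tree(root):
    interleave(root)
    copy_randoms(root)
    return separate(root)
```

Note that `copy_randoms` has to finish for the whole tree before `separate` starts, because `node.random.left` is only guaranteed to be a clone while every original node still has its clone as left child.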

I think the interleaving method can be simplified a little.

1) For every node A in the original tree, create a clone cA with the same left and right pointers as A. Then set A's left pointer to cA.

       P                     P
      / \                   / 
     /   \                 /   
    A     B               cP    
   /       \             / \  
  /         \           /   \ 
 X           Z         A     B    
                      /     / 
                     cA    cB
                    /        \ 
                   X          Z
                  /          /     
                 cX        cZ      

2) Now, given a node and its clone (which is just node.left), the random pointer for the clone is node.random.left (if node.random exists).

3) Finally, the binary tree can be un-interleaved.

I find this interleaving makes reasoning about the code much simpler.

Here is the code:

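class Node:
    # Minimal node type assumed by the functions below.
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None
        self.random = None


def clone_and_interleave(root):
    if not root:
        return

    # Children first, so they are already interleaved when the clone adopts them.
    clone_and_interleave(root.left)
    clone_and_interleave(root.right)

    # The clone takes over both child pointers of the original node.
    cloned_root = Node(root.data)
    cloned_root.left, cloned_root.right = root.left, root.right

    root.left = cloned_root
    root.right = None # This isn't necessary, but doesn't hurt either.

def set_randoms(root):
    if not root:
        return

    # The original children now hang off the clone, so recurse through it.
    cloned_root = root.left
    set_randoms(cloned_root.left)
    set_randoms(cloned_root.right)

    # The clone of any original node X is X.left.
    cloned_root.random = root.random.left if root.random else None

def unterleave(root):
    # Returns the pair (clone of root, root) with both trees separated again.
    if not root:
        return (None, None)

    cloned_root = root.left
    cloned_root.left, root.left = unterleave(cloned_root.left)
    cloned_root.right, root.right = unterleave(cloned_root.right)

    return (cloned_root, root)


def cloneTree(root):
    clone_and_interleave(root)
    set_randoms(root)
    cloned_root, root = unterleave(root)

    return cloned_root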

The terminology used in those interview questions is absurdly bad. It's a case of one unwitting knuckle-dragger somewhere calling that pointer the “random” pointer, and everyone just nods and accepts this as if it were some CS mantra from an ivory tower. Alas, it's sheer lunacy.

Either what you have is a tree or it isn't. A tree is an acyclic directed graph with at most a single edge directed toward any node, and adding extra pointers can't change that - whatever the pointers point to must retain this property.

But when the node has a pointer that can point to any other node, it's not a tree. You've got a proper directed graph that may have cycles in it, and looking at it as if it were a tree is silly at this point. It's not a tree. It's just a generic directed graph that you're cloning. So any relevant directed graph cloning technique will work, but the insistence on using the terms “tree” and “random pointer” obscures this simple fact and confuses matters terribly.
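To make the distinction concrete, here is a small Python sketch that checks whether the structure reachable from a root is still a tree once a chosen set of pointer fields is treated as edges. The `Node` class and the `is_tree` helper are hypothetical names introduced only for this illustration; the check simply reports "not a tree" as soon as any node is reached a second time.

```python
class Node:
    """Hypothetical node with the three pointer fields from the question."""
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.random = None


def is_tree(root, edges=("left", "right", "random")):
    """Return True if following the given fields from root never reaches a node twice."""
    if root is None:
        return True
    seen = {id(root)}            # identity of every node reached so far
    stack = [root]
    while stack:
        node = stack.pop()
        for name in edges:
            target = getattr(node, name)
            if target is None:
                continue
            if id(target) in seen:
                return False     # second incoming edge, or a cycle back to an ancestor
            seen.add(id(target))
            stack.append(target)
    return True
```

With only `left` and `right` as edges, an ordinary binary tree passes; as soon as a `random` pointer targets a node that is also somebody's child, the same structure fails the check.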

This snafu indicates that whoever came up with the question was not qualified to be doing any such interviewing. This stuff is covered in any decent introductory data structure textbook, so you'd think it shouldn't take some astronomical uphill effort to just articulate what you need in a straightforward manner. Let the interviewees deal with users who can't articulate themselves once they get that job - the data structure interview is neither the place nor the time for that. It reeks of stupidity and carelessness, and leaves a permanently bad aftertaste. It's probably yet another stupid thing that ended up in some “interview question bank” because one poor soul got asked it by a careless idiot once and now everyone treats it as gospel. It's yet again the blind leading the blind, and cluelessness abounds.

Copying arbitrary graphs is a well-solved problem, and in all cases you need to retain the state of your traversal somehow. Whether that's done by inserting nodes into the original graph to mark the progress (one could call it intrusive marking), by adding data to the copy in progress and removing it when done, by using an auxiliary structure such as a hash, or by doing a repeat traversal to check whether you already made a copy of that node elsewhere is of secondary importance. The purpose is always the same: to retain the same state information, just encoded in various ways, trading off speed and memory use (as always).

When thinking about this problem, you need to tell yourself what sort of state you need to finish the copy, abstract it away, and implement the copy using this abstract interface. Then you can implement that interface in a few ways, but at that point the copy itself doesn't obscure things, since you look at this simple abstract state-preserving interface and not at the copy process.
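One way to picture such an interface is the Python sketch below. Everything in it is an assumption made for illustration: the `Node` class, the `lookup`/`remember`/`cleanup` method names, and the two state keepers `DictState` (an auxiliary hash map) and `IntrusiveState` (a scratch attribute parked on the original nodes). The copy routine `clone_graph` only talks to the interface and treats `left`, `right` and `random` as plain graph edges. Note that this `IntrusiveState` still keeps a list of marked nodes so it can clean up afterwards, so it is not free of extra memory.

```python
class Node:
    """Hypothetical node with the three pointer fields from the question."""
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.random = None


class DictState:
    """Auxiliary structure: a hash map from each original node to its copy."""
    def __init__(self):
        self._copies = {}

    def lookup(self, node):
        return self._copies.get(node)

    def remember(self, node, copy):
        self._copies[node] = copy

    def cleanup(self):
        self._copies.clear()


class IntrusiveState:
    """Intrusive marking: the copy is stored in a scratch attribute on the original."""
    def __init__(self):
        self._marked = []        # kept only so cleanup() can remove the marks

    def lookup(self, node):
        return getattr(node, "_copy", None)

    def remember(self, node, copy):
        node._copy = copy
        self._marked.append(node)

    def cleanup(self):
        for node in self._marked:
            del node._copy
        self._marked.clear()


def clone_graph(root, state):
    """Copy everything reachable from root, using `state` to remember what was copied."""
    if root is None:
        return None
    pending = []

    def copy_of(node):
        copy = state.lookup(node)
        if copy is None:         # first visit: create the copy and schedule its edges
            copy = Node(node.key)
            state.remember(node, copy)
            pending.append(node)
        return copy

    result = copy_of(root)
    while pending:
        node = pending.pop()
        copy = state.lookup(node)
        for name in ("left", "right", "random"):
            target = getattr(node, name)
            if target is not None:
                setattr(copy, name, copy_of(target))
    state.cleanup()
    return result
```

Calling `clone_graph(root, DictState())` or `clone_graph(root, IntrusiveState())` should produce the same copy; only the way the traversal state is encoded differs.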

In real life the choice of any particular implementation highly depends on the amount and structure of the data being copied, and on the extent to which you have control over it all. If you're the one controlling the structure of the nodes, then you'll usually find that they have some padding that you could use to store a bit of state information. Or you'll find that the memory block allocated for the nodes is actually larger than requested: malloc will often end up providing a block larger than asked for, and all reasonable platforms have APIs that let you retrieve the actual size of the block and thus check whether there's some leftover space just begging to be used. These APIs are not always fast, so be careful there of course. But you see where this is going: such optimization requires benchmarks and a clear need driven by the demands of the application. Otherwise, use whatever is least likely to be buggy - ideally a C library that provides data structures that you could use right away. If you need a cyclic graph, there are libraries that do just that - use them first.

But boy, do I hate that idiotic “random” name of the pointer. Who comes up with this nonsense and why do they pollute so many minds? There's nothing random about it. And a tree that's not a tree is not a tree. I'd fail that interviewer in a split second…
