Finding a loop in a possibly malformed binary tree by using the relation between the number of vertices and number of edges

I faced this question in an interview recently. The original question was:

Given a pointer to a struct (laid out so that it can be a node of either a binary tree or a doubly linked list), write a function that returns whether it points to a binary tree or a DLL. The struct is defined like this:

struct node
    {
     /*data member*/
     node *l1;
     node *l2;
    };

I dove into the problem straightaway, but then I realized there is some ambiguity in it. What if the pointer doesn't point to either of them (that is, it is a malformed DLL or a malformed tree)? The interviewer told me that in that case I have to write the function so that it can report all three cases. So the return value of the function becomes an enum of the form:

enum StatesOfRoot 
   {
   TREE,
   DLL,
   INVALID_DATA_STRUCTURE,  /* case of malformed dll or malformed tree */
   EITHER_TREE_DLL,         /* case when there is only 1 node */
   };

So the problem reduced to verifying the properties of a binary tree and of a DLL. For the DLL it was easy. For the binary tree, the only verification I could think of was that there should not be more than one path from the root to any node (or, there should not be any loops). So I proposed that we do a depth-first search and keep track of the visited nodes, either using a HashMap (which the interviewer rejected straightaway) or maintaining a set of visited nodes using a BST (I wanted to use std::set, but the interviewer suddenly added another restriction: I can't use the STL). He rejected this idea, saying that I am not allowed to use any other data structure. Then I proposed a modified version of the tortoise and hare algorithm (treating each branch of the binary tree as a singly linked list), to which he said this won't work. After that I went on to propose a few more solutions which were sort of ugly (they involved deleting nodes, maintaining a copy of the tree, etc.).

The core of the problem

Then the interviewer proposed his solution. He said we can count the number of vertices and the number of edges and assert the relation number of vertices = number of edges + 1 (a property which has to hold for a binary tree). What baffled me was: how can we count the number of vertices without using any additional data structure? He said it can be done by simply performing any traversal (preorder, postorder, inorder). I asked how we would prevent an infinite loop if there is a loop in the tree, since we are not tracking the visited nodes. He said this is possible but didn't tell me how. I seriously doubt his approach. Can anyone provide some insight on whether the solution he proposed was right? If yes, how would you explicitly maintain a count of distinct vertices? Note that all you are passed is a pointer; you have no other information.

PS: Later I received a notification that I am through to the next round, without ever having given the interviewer a final solution. Was it supposed to be a trick round?

EDIT:

Just to make things clear: if we assume that the third case is not present (that is, we are guaranteed it's a DLL or a binary tree), then the problem is trivial. It's the tree part of the third case that is driving me crazy. Kindly note this point while answering.

You are right to be skeptical of his solution.

The doubly linked list is the easy one. DLLs enforce these invariants:

  1. Except for null nodes, a node's left node's right node is itself.
  2. Except for null nodes, a node's right node's left node is itself.
  3. Noncyclic DLLs will eventually reach a null as you keep following left.
  4. Noncyclic DLLs will eventually reach a null as you keep following right.
  5. Cyclic DLLs will eventually reach the starting node as you keep following left.

The preceding invariants are easy to check by walking over the DLL with only one extra temporary variable.

(Note: checking 3 and 4, or 5, may take a long time.)
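The DLL check described above can be sketched in C as follows. This is only an illustration under stated assumptions: the names `walk_ok` and `is_valid_dll` are hypothetical, `l1` is assumed to act as the left (previous) link and `l2` as the right (next) link, and an empty list is treated as valid. Because the back-link is checked at every step, the walk should not get trapped in a loop that bypasses the start node: re-entering an already visited node would require that node to link back to two different predecessors.

```c
#include <stdbool.h>
#include <stddef.h>

struct node {
    /* data member omitted */
    struct node *l1; /* assumed to act as "left" / previous */
    struct node *l2; /* assumed to act as "right" / next */
};

/* Walk from start in one direction (forward: follow l2, otherwise l1),
   checking at every step that the neighbour links back to the current node.
   Stops at NULL (noncyclic list) or on returning to start (cyclic list). */
static bool walk_ok(const struct node *start, bool forward)
{
    const struct node *cur = start;
    for (;;) {
        const struct node *next = forward ? cur->l2 : cur->l1;
        if (next == NULL) {
            return true;                /* invariant 3 or 4: reached the end */
        }
        const struct node *back = forward ? next->l1 : next->l2;
        if (back != cur) {
            return false;               /* invariant 1 or 2 is violated */
        }
        if (next == start) {
            return true;                /* invariant 5: came back around */
        }
        cur = next;
    }
}

/* Hypothetical helper: true if start belongs to a well-formed DLL. */
bool is_valid_dll(const struct node *start)
{
    if (start == NULL) {
        return true;                    /* assumption: an empty list is valid */
    }
    return walk_ok(start, true) && walk_ok(start, false);
}
```

Calling `is_valid_dll` on any node of the list walks both directions from that node, using only the `cur`, `next` and `back` temporaries.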

The binary tree is the hard one. BTs enforce these invariants:

  1. "No Loops" can be shown by any of the following:
    • Demonstrate that no two nodes point to the same node and no node points to the root.
    • Demonstrate that all paths from the root eventually end at a leaf.
    • Demonstrate that all referenced nodes are distinct.
  2. "No Merges" can be shown by any of the following:
    • Demonstrate that no two nodes point to the same node.
    • Demonstrate that all referenced nodes are distinct.

As you suggested, these may be determined by traversing the tree and marking each node visited to ensure that no node gets visited twice, or alternatively by storing each visited node (such as in a hash set or other structure) so you can quickly look up whether a node is distinct.

You could probably validate that there are no loops in the tree without another data structure by simply traversing the tree and keeping track of your current depth. If you got deeper in the tree than the computer's memory could hold nodes (or visited more nodes than that), you would be sure to be in an infinite loop.

However, that doesn't help us distinguish binary directed acyclic graphs (DAGs) from binary trees.

If, however, we knew the count of elements in the tree, as is usually the case for library implementations of binary trees, we could detect an infinite loop by comparing the number of edges traversed against the previously known number of nodes, as the interviewer suggested.

Without knowing that number ahead of time, it is difficult to tell the difference between an infinitely large tree and a large finite tree (unless you know the memory size of the computer, or other information such as how long it took to build the tree).
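As a rough sketch of the counting idea, assuming a trusted node count is somehow available (which the original question does not provide), a traversal can be given an edge budget of count - 1 and stopped as soon as it would exceed it. The names `follow_edges`, `edges_match_count` and `known_count` are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

struct node {
    /* data member omitted */
    struct node *l1;
    struct node *l2;
};

/* Follow every non-NULL pointer below v, spending one unit of *budget per
   edge. Returns false as soon as an edge is found with no budget left, so
   the recursion terminates even if the structure contains a loop. */
static bool follow_edges(const struct node *v, size_t *budget)
{
    const struct node *kids[2] = { v->l1, v->l2 };
    for (int i = 0; i < 2; ++i) {
        if (kids[i] == NULL) {
            continue;
        }
        if (*budget == 0) {
            return false;               /* more edges than count - 1 */
        }
        --*budget;
        if (!follow_edges(kids[i], budget)) {
            return false;
        }
    }
    return true;
}

/* Assumption: known_count is a trusted number of nodes in the structure.
   True only if the traversal follows exactly known_count - 1 edges. */
bool edges_match_count(const struct node *root, size_t known_count)
{
    if (root == NULL) {
        return known_count == 0;
    }
    if (known_count == 0) {
        return false;
    }
    size_t budget = known_count - 1;
    return follow_edges(root, &budget) && budget == 0;
}
```

The recursion depth is bounded by the budget, so the only extra storage is the call stack and one counter.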

This still does not help us detect the "No Merges" invariant.

I can't think of any useful way to determine that no merges exist without showing that no node is referenced twice, either by storing visited nodes in an external data structure or by marking each node as visited when you visit it.

As a last resort, you could do the following:

  1. Show there are "No Loops" based on the tree depth (or number of visited nodes) compared to computer memory (or as described in the edit below).
  2. Demonstrate "No Merges" through this method:
    • Start at the root's left child, i.e. depth 1 of the tree.
    • Visit every node at depth 1 and depth 0 and verify that only the direct parent references the selected node.
    • Do the same for the root's right child.
    • Continue this process for each node in the tree:
      1. select a node, keeping a reference to its direct parent,
      2. visit every node higher in the tree and at the same depth as the selected node,
      3. verify that out of the visited nodes, only the direct parent references the selected child.
    • Once this is done, traverse the tree again to verify that the left and right pointers of every node do not both point to the same node.

This process would only take a few extra variables, but it would take a lot of time, since you individually compare each node to every node higher in the tree or at the same depth.

My intuition tells me that the above procedure would be a v-squared algorithm, instead of just being order v.
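A minimal C sketch of the reference-counting check behind the procedure above, under some simplifying assumptions: the names `count_refs` and `no_merges` are hypothetical, and instead of comparing only against nodes at the same depth or higher, it compares each node against every node within a caller-chosen `max_depth`, which keeps the recursion finite even when the structure contains a loop. Anything deeper than `max_depth` is not examined.

```c
#include <stdbool.h>
#include <stddef.h>

struct node {
    /* data member omitted */
    struct node *l1;
    struct node *l2;
};

/* Count the pointer fields equal to target among the nodes found at
   depth 0..max_depth below v (nodes reached twice are counted twice). */
static int count_refs(const struct node *v, const struct node *target,
                      int max_depth)
{
    if (v == NULL || max_depth < 0) {
        return 0;
    }
    int here = (v->l1 == target) + (v->l2 == target);
    return here
        + count_refs(v->l1, target, max_depth - 1)
        + count_refs(v->l2, target, max_depth - 1);
}

/* True if, within max_depth of root, the root is referenced by nobody and
   every other node below v is referenced exactly once.
   Call as no_merges(root, root, 0, max_depth). */
bool no_merges(const struct node *root, const struct node *v,
               int depth, int max_depth)
{
    if (v == NULL || depth > max_depth) {
        return true;
    }
    int expected = (depth == 0) ? 0 : 1;
    if (count_refs(root, v, max_depth) != expected) {
        return false;
    }
    return no_merges(root, v->l1, depth + 1, max_depth)
        && no_merges(root, v->l2, depth + 1, max_depth);
}
```

On a well-formed tree this does one bounded traversal per node, which matches the v-squared intuition; on malformed input the bounded unfolding can be much more expensive.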

Add a comment if any of you think of another way to approach this.


Edit: you may be able to verify "No Loops" here by extending the search to compare not just with every node at the same depth and higher, but with every node in the tree. You would need to do this progressively: compare each node with every node above it in the tree and at its own depth, then check against all nodes from 1 to 5 levels deeper than it, then from 6 to 10 levels deeper, and so forth. If you check in a non-progressive way, you could get stuck searching infinitely.

First of all, the original problem clearly states that the correct input is either a DLL or a tree, so IMO there's no ambiguity: it just doesn't matter how your code behaves if the input is wrong.

Anyway, you and your interviewer got driven off into 'what if' land.

But then, what does he mean by 'not using additional data structures'? You cannot traverse even a guaranteed-correct binary tree without a stack to remember the turning points (either through the recursion mechanism or by manually creating a stack data structure).

So I assume we can use a stack and recursion.

A little note: yes, I know we can do it in constant memory if the node structure contains pointers up the tree (we can modify the pointers and restore them at the end), but here we don't have those, so I'll skip the proof for this one and assume the "obvious": we have to be able to use recursion at least.

Well, I wouldn't call the following 'a simple inorder traversal', but here you have it:

#include <stdio.h>
#include <stdbool.h>

struct node
    {
     /*data member*/
     struct node *l1;
     struct node *l2;
    };

// This one counts the nodes in a subtree of V with a depth no more than l that are equal to V0
int CountEqual(struct node* V0, struct node* V, int l)
{
    int thisOneIsEqual = 0;
    if( V == NULL ) {
        return 0;
    }

    if( l == 0 ) {
        return 0;
    }

    if( V0 == V ) {
        thisOneIsEqual = 1;
    }

    return thisOneIsEqual +
        CountEqual(V0, V->l1, l - 1) +
        CountEqual(V0, V->l2, l - 1);
}

// This one checks whether there're equal nodes in a subtree of root with a depth of L
bool Eqs(struct node* root, int L, struct node* V, int l)
{
    if( V == 0 ) {
        return false;
    }

    if( l == 0 ) {
        return false;
    }

    if( CountEqual(V, root, L) > 1 ) {
        return true;
    }

    return
        Eqs(root, L, V->l1, l - 1) ||
        Eqs(root, L, V->l2, l - 1);
}

// This checks whether the depth of the tree rooted at V is no more than l
bool HeightLessThanL(struct node* V, int l)
{
    if( V == 0 ) {
        return true;
    }

    if( l == 0 ) {
        return false;
    }

    return
        HeightLessThanL(V->l1, l - 1) &&
        HeightLessThanL(V->l2, l - 1);
}

bool isTree(struct node* root)
{
    int l = 1;
    while( 1 ) {
        if( HeightLessThanL(root, l - 1) ) {
            return true;
        }

        if( Eqs(root, l, root, l) ) {
            return false;
        }

        l++;
    }
}

// A simple test: build a correct tree, then add cycles, equal nodes etc.
#define SIZE 5
int main()
{
    struct node graph[SIZE];
    int i;

    for( i = 0; i < SIZE; ++i ) {
        graph[i].l1 = 0;
        graph[i].l2 = 0;
        if( 2 * i + 1 < SIZE ) {
            graph[i].l1 = graph + 2 * i + 1;
        }
        if( 2 * i + 2 < SIZE ) {
            graph[i].l2 = graph + 2 * i + 2;
        }
    }

    graph[1].l2 = graph + 3;

    printf( "%d\n", isTree( graph ) );
    return 0;
}

The idea is that for some L, either we know that we have a tree of height no more than L, or there are two equal nodes in a subtree of depth L.

You have to assume some common interface to DLLs and trees. An abstract parent might define a virtual toHead(), where a DLL would go to its head node and a tree would go to its root, and return the node object, etc. Hash tables are overkill here. My C/C++ is rusty, so the pointers might be a little wrong; however, what you are looking for is whether the location in memory is the same as the value of "copyHead", since the value stored in "copyHead" is the location of the head. Hope that makes sense to you.

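type *myType = &structure;

// toHead(), next() and curr() are the assumed common interface; each returns a node pointer.
node *copyHead = myType->toHead();

// next() is assumed to advance the cursor and return the new current node.
while (copyHead != myType->next()) {
    if (myType->curr() == NULL) { return "is tree"; }
}

return "is DLL";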
