C++ Implementation of Mergesort of Linked-List Fails on Joining Sublists Of More Than 1 Node

I have been working on a templated implementation of a linked list, reinventing the wheel on purpose so that I would stumble into just this type of problem and learn the subtle nuances of handling pointers to class instances. The problem I have stumbled into has to do with merging sublists: the second merge (the first merge where sublists can have multiple nodes) fails, and a prior class instance (either from split or mergesorted) appears to go out of scope. That should not have any effect on the merge, since the pointer assignment is to a prior list that always remains in scope until after the assignment of the original list node has taken place.

The key here is that all class instances hold pointers to the original nodes from the original list, so the sublist instance only needs to remain in scope until the beginning node of the sublist is returned and assigned to the list in the previous recursion. I am trying to port a perfectly good, 100% working C implementation. So the issue is my understanding of why I cannot treat class instances as I would a struct in C, but I cannot put my finger on documentation that explains why.

The class list_t contains the struct node_t to form the list.

/* linked list node */
template <class T>
struct node_t {
    T data;
    node_t<T> *next;
};

template <class T>
class list_t {
    node_t<T> *head, *tail;
    int (*cmp)(const node_t<T>*, const node_t<T>*);

    public:
    list_t (void);                          /* constructors */
    list_t (int(*f)(const node_t<T>*, const node_t<T>*));
    ~list_t (void);                         /* destructor */
    list_t (const list_t&);                 /* copy constructor */
    /* setter for compare function */
    ...
    list_t split (void);                    /* split list ~ 1/2 */
    ...
    /* merge lists after mergesort_start */
    node_t<T> *mergesorted (node_t<T> *a, node_t<T> *b);
    void mergesort_run (list_t<T> *l);      /* mergesort function */
    void mergesort (void);                  /* wrapper for mergesort */
};

(Yes, I know, no _t suffix; that's not the point here.)

The split function is working fine and is:

/* split list l into lists a & b */
template <class T>
list_t<T> list_t<T>::split (void)
{
    list_t<T> s;                /* new instance of class */

    node_t<T> *pa = head,       /* pointer to current head */
            *pb = pa->next;     /* 2nd pointer to double-advance */

    while (pb) {                /* while not end of list */
        pb = pb->next;          /* advance 2nd ptr */
        if (pb) {               /* if not nullptr */
            pa = pa->next;      /* advance current ptr */
            pb = pb->next;      /* advance 2nd ptr again */
        }
    }

    s.tail = tail;              /* 2nd half tail will be current tail */
    tail = pa;                  /* current tail is at pa */

    s.head = pa->next;          /* 2nd half head is next ptr */
    pa->next = nullptr;         /* set next ptr NULL to end 1st 1/2 */

    return s;                   /* return new instance */
}

For the mergesort, I have a wrapper that calls the actual mergesort function, mergesort_run. This was done so the tail pointer is only updated once, after the sort completes, e.g.

/* wrapper to the actual mergesort routine in mergesort_run */
template <class T>
void list_t<T>::mergesort(void)
{
    mergesort_run (this);

    /* set tail pointer to last node after sort */
    for (node_t<T> *pn = head; pn; pn = pn->next)
        tail = pn;
}

mergesort_run is as follows:

/* split and merge splits in sort order */
template <class T>
void list_t<T>::mergesort_run (list_t<T> *l) 
{ 
    /* Base case -- length 0 or 1 */
    if (!l->head || !l->head->next) { 
        return; 
    } 

    /* Split head into 'a' and 'b' sublists */
    list_t<T> la = l->split(); 

    /* Recursively sort the sublists */
    mergesort_run(l); 
    mergesort_run(&la);

    /* merge the two sorted lists together */
    l->head = mergesorted (l->head, la.head);
}

The merge function, mergesorted, merges the sublists in sort order:

template <class T>
node_t<T> *list_t<T>::mergesorted (node_t<T> *a, node_t<T> *b) 
{ 
    node_t<T> *result = nullptr;

    /* Base cases */
    if (!a) 
        return (b); 
    else if (!b) 
        return (a); 

    /* Pick either a or b, and recur */
    if (cmp (a, b) <= 0) { 
        result = a; 
        result->next = mergesorted (a->next, b); 
    } 
    else { 
        result = b; 
        result->next = mergesorted (a, b->next); 
    }

    return result; 
} 

Working C Implementation I am Moving From

Each of the above (other than me splitting out the initial wrapper) is an implementation from the following working C split/mergesort:

/* split list l into lists a & b */
void split (list_t *l, list_t *a)
{
    node_t  *pa = l->head,
            *pb = pa->next;

    while (pb) {
        pb = pb->next;
        if (pb) {
            pa = pa->next;
            pb = pb->next;
        }
    }

    a->tail = l->tail;
    l->tail = pa;

    a->head = pa->next;
    pa->next = NULL;
}

/* merge splits in sort order */
node_t *mergesorted (node_t *a, node_t *b) 
{ 
    node_t  *res = NULL;

    /* base cases */
    if (!a) 
        return (b); 
    else if (!b) 
        return (a); 

    /* Pick either a or b, and recurse */
    if (a->data <= b->data) { 
        res = a; 
        res->next = mergesorted (a->next, b); 
    } 
    else { 
        res = b; 
        res->next = mergesorted (a, b->next); 
    } 
    return res; 
} 

/* sorts the linked list by changing next pointers (not data) */
void mergesort (list_t *l) 
{ 
    list_t la;
    node_t *head = l->head; 

    /* Base case -- length 0 or 1 */
    if (!head || !head->next) { 
        return; 
    } 

    /* Split head into 'a' and 'b' sublists */
    split (l, &la); 

    /* Recursively sort the sublists */
    mergesort(l); 
    mergesort(&la); 

    /* answer = merge the two sorted lists together */
    l->head = mergesorted (l->head, la.head);

    /* set tail pointer to last node after sort */
    for (head = l->head; head; head = head->next)
        l->tail = head;
}

On 2nd Merge The Nodes From The 1st Merge Vanish

I have stepped through the C++ implementation with gdb and valgrind. In gdb the code completes without error, but valgrind reports invalid reads of 4 and 8 bytes in a block that has already been freed. That suggests the destructor is freeing memory (which it should), but that the pointer assignments done as the recursion unwinds depend on the address of the pointer from the nested recursive call instead of just using the values at the address from the original (as the above C code does perfectly).

What is happening is that after the list is split down to sublists with a single node and the first merge takes place, we are still good. When the next unwind happens, where you would merge the combined nodes with another sublist, the values of the 2-node sublist are lost. So after picking through the C and C++ implementations, I am feeling like an idiot, because for problems I could simply debug and correct in C, I am missing some critical understanding that would allow me to do the same with a C++ class implementation of the same code.

Test Code

int main (void) {

    list_t<int> l;

    int arr[] = {12, 11, 10, 7, 4, 14, 8, 16, 20, 19, 
                  2, 9, 1, 13, 17, 6, 15, 5, 3, 18};
    unsigned asz = sizeof arr / sizeof *arr;

    for (unsigned i = 0; i < asz; i++)
        l.addnode (arr[i]);

    l.prnlist();
#ifdef ISORT
    l.insertionsort();
#else
    l.mergesort();
#endif
    l.prnlist();
}

The beginning merge of the left sublist, after it is split down to nodes 12 and 11, goes fine. As soon as I go to merge the 11, 12 sublist with 10, the 11, 12 sublist values are gone.

MCVE

#include <iostream>

/* linked list node */
template <class T>
struct node_t {
    T data;
    node_t<T> *next;
};

/* default compare function for types w/overload (ascending) */
template <typename T>
int compare_asc (const node_t<T> *a, const node_t<T> *b)
{
    return (a->data > b->data) - (a->data < b->data);
}

/* compare function for types w/overload (descending) */
template <typename T>
int compare_desc (const node_t<T> *a, const node_t<T> *b)
{
    return (a->data < b->data) - (a->data > b->data);
}

template <class T>
class list_t {
    node_t<T> *head, *tail;
    int (*cmp)(const node_t<T>*, const node_t<T>*);

    public:
    list_t (void);                          /* constructors */
    list_t (int(*f)(const node_t<T>*, const node_t<T>*));
    ~list_t (void);                         /* destructor */
    list_t (const list_t&);                 /* copy constructor */
    /* setter for compare function */
    void setcmp (int (*f)(const node_t<T>*, const node_t<T>*));

    node_t<T> *addnode (T data);            /* simple add at end */
    node_t<T> *addinorder (T data);         /* add in order */
    void delnode (T data);                  /* delete node */
    void prnlist (void);                    /* print space separated */

    list_t split (void);                    /* split list ~ 1/2 */

    void insertionsort (void);              /* insertion sort list */

    /* merge lists after mergesort_start */
    node_t<T> *mergesorted (node_t<T> *a, node_t<T> *b);
    void mergesort_run (list_t<T> *l);      /* mergesort function */
    void mergesort (void);                  /* wrapper for mergesort */
};

/* constructor (default) */
template <class T>
list_t<T>::list_t (void)
{
    head = tail = nullptr;
    cmp = compare_asc;
}

/* constructor taking compare function as argument */
template <class T>
list_t<T>::list_t (int(*f)(const node_t<T>*, const node_t<T>*))
{
    head = tail = nullptr;
    cmp = f;
}

/* destructor free all list memory */
template <class T>
list_t<T>::~list_t (void)
{
    node_t<T> *pn = head;
    while (pn) {
        node_t<T> *victim = pn;
        pn = pn->next;
        delete victim;
    }
}

/* copy ctor - copy existing list */
template <class T>
list_t<T>::list_t (const list_t& l)
{
    cmp = l.cmp;                        /* assign compare function ptr */
    head = tail = nullptr;              /* initialize head/tail */

    /* copy data to new list */
    for (node_t<T> *pn = l.head; pn; pn = pn->next)
        this->addnode (pn->data);
}

/* setter compare function */
template <class T>
void list_t<T>::setcmp (int(*f)(const node_t<T>*, const node_t<T>*))
{
    cmp = f;
}

/* add using tail ptr */
template <class T>
node_t<T> *list_t<T>::addnode (T data)
{
    node_t<T> *node = new node_t<T>;        /* allocate/initialize node */
    node->data = data;
    node->next = nullptr;

    if (!head)
        head = tail = node;
    else {
        tail->next = node;
        tail = node;
    }

    return node;
}

template <class T>
node_t<T> *list_t<T>::addinorder (T data)
{
    if (!cmp) {     /* validate compare function not nullptr */
        std::cerr << "error: compare is nullptr.\n";
        return nullptr;
    }

    node_t<T> *node = new node_t<T>;        /* allocate/initialize node */
    node->data = data;
    node->next = nullptr;

    node_t<T> **ppn = &head,                /* ptr-to-ptr to head */
              *pn = head;                   /* ptr to head */

    while (pn && cmp (node, pn) > 0) {      /* node sorts after current */
        ppn = &pn->next;                    /* ppn to address of next */
        pn = pn->next;                      /* advance pointer to next */
    }

    node->next = pn;                        /* set node->next to next */
    if (pn == nullptr)
        tail = node;
    *ppn = node;                            /* set current to node */

    return node;                            /* return node */
}

template <class T>
void list_t<T>::delnode (T data)
{
    node_t<T> **ppn = &head;        /* pointer to pointer to node */
    node_t<T> *pn = head;           /* pointer to node */

    for (; pn; ppn = &pn->next, pn = pn->next) {
        if (pn->data == data) {
            *ppn = pn->next;        /* set address to next */
            delete pn;
            break;
        }
    }
}

template <class T>
void list_t<T>::prnlist (void)
{
    if (!head) {
        std::cout << "empty-list\n";
        return;
    }
    for (node_t<T> *pn = head; pn; pn = pn->next)
        std::cout << " " << pn->data;
    std::cout << '\n';
}

/* split list l into lists a & b */
template <class T>
list_t<T> list_t<T>::split (void)
{
    list_t<T> s;                /* new instance of class */

    node_t<T> *pa = head,       /* pointer to current head */
            *pb = pa->next;     /* 2nd pointer to double-advance */

    while (pb) {                /* while not end of list */
        pb = pb->next;          /* advance 2nd ptr */
        if (pb) {               /* if not nullptr */
            pa = pa->next;      /* advance current ptr */
            pb = pb->next;      /* advance 2nd ptr again */
        }
    }

    s.tail = tail;              /* 2nd half tail will be current tail */
    tail = pa;                  /* current tail is at pa */

    s.head = pa->next;          /* 2nd half head is next ptr */
    pa->next = nullptr;         /* set next ptr NULL to end 1st 1/2 */

    return s;                   /* return new instance */
}

/** insertion sort of linked list.
 *  re-orders list in sorted order.
 */
template <class T>
void list_t<T>::insertionsort (void) 
{ 
    node_t<T> *sorted = head,       /* initialize sorted list to 1st node */
              *pn = head->next;     /* advance original list node to next */

    sorted->next = NULL;            /* initialize sorted->next to NULL */

    while (pn) {                    /* iterate over existing from 2nd node */
        node_t<T> **pps = &sorted,  /* ptr-to-ptr to sorted list */
                *ps = *pps,         /* ptr to sorted list */
                *next = pn->next;   /* save list next as separate pointer */

        while (ps && cmp(ps, pn) < 0) {  /* loop until sorted */
            pps = &ps->next;        /* get address of next node */
            ps = ps->next;          /* get next node pointer */
        }

        *pps = pn;          /* insert existing in sort order as current */
        pn->next = ps;      /* set next as sorted next */
        pn = next;          /* reinitialize existing pointer to next */
    }

    head = sorted;          /* update head to sorted head */

    /* set tail pointer to last node after sort */
    for (pn = head; pn; pn = pn->next)
        tail = pn;
}

/* FIXME mergesort recursion not working */
template <class T>
node_t<T> *list_t<T>::mergesorted (node_t<T> *a, node_t<T> *b) 
{ 
    node_t<T> *result = nullptr;

    /* Base cases */
    if (!a) 
        return (b); 
    else if (!b) 
        return (a); 

    /* Pick either a or b, and recur */
    if (cmp (a, b) <= 0) { 
        result = a; 
        result->next = mergesorted (a->next, b); 
    } 
    else { 
        result = b; 
        result->next = mergesorted (a, b->next); 
    }

    return result; 
} 

/* split and merge splits in sort order */
template <class T>
void list_t<T>::mergesort_run (list_t<T> *l) 
{ 
    /* Base case -- length 0 or 1 */
    if (!l->head || !l->head->next) { 
        return; 
    } 

    /* Split head into 'a' and 'b' sublists */
    list_t<T> la = l->split(); 

    /* Recursively sort the sublists */
    mergesort_run(l); 
    mergesort_run(&la);

    /* merge the two sorted lists together */
    l->head = mergesorted (l->head, la.head);
}

/* wrapper to the actual mergesort routine in mergesort_run */
template <class T>
void list_t<T>::mergesort(void)
{
    mergesort_run (this);

    /* set tail pointer to last node after sort */
    for (node_t<T> *pn = head; pn; pn = pn->next)
        tail = pn;
}

int main (void) {

    list_t<int> l;

    int arr[] = {12, 11, 10, 7, 4, 14, 8, 16, 20, 19, 
                  2, 9, 1, 13, 17, 6, 15, 5, 3, 18};
    unsigned asz = sizeof arr / sizeof *arr;

    for (unsigned i = 0; i < asz; i++)
        l.addnode (arr[i]);

    l.prnlist();
#ifdef ISORT
    l.insertionsort();
#else
    l.mergesort();
#endif
    l.prnlist();
}

Result of Insertion Sort -- Expected Results

Compile with -DISORT to test insertion sort (working):

$ ./bin/ll_merge_post
 12 11 10 7 4 14 8 16 20 19 2 9 1 13 17 6 15 5 3 18
 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20

Result of Mergesort -- Not Good

$ ./bin/ll_merge_post
 12 11 10 7 4 14 8 16 20 19 2 9 1 13 17 6 15 5 3 18
 0 16108560 16108656 16108688 16108560 16108816 16108784 16108848 16108752 16108720 16109072 16108976 16108944 16109008 16108880 16108912 16109136 16109104 16109168 16109040

So I'm stuck (and it is probably something simple I should see but don't). Why is the merging of the sublists failing? What is the critical piece of understanding about class instances in C++ versus C struct handling that I'm missing?

In mergesort_run, you have a local list la that contains half of your source list. At the end of the function you merge the contents of la back into l, but the variable la itself still points at the nodes you merged. When the destructor for la runs, these nodes will be deleted.

If you set the head node of la to a null pointer (la.head = nullptr) after doing the merge, then when the destructor runs there aren't any nodes for it to delete.

One unrelated issue is that you don't copy cmp in some of the places where a new list is created (like split).
