
delete an entry from a singly-linked list

Today I was watching the talk "The mind behind Linux | Linus Torvalds". In the video, Linus showed two pieces of code, both of which remove a given element from a singly-linked list.

The first one (which is the normal one):

void remove_list_entry(linked_list* entry) {
    linked_list* prev = NULL;
    linked_list* walk = head;
    while (walk != entry) {
        prev = walk;
        walk = walk->next;
    }
    if (!prev) {
        head = entry->next;
    } else {
        prev->next = entry->next;
    }
}

And the better one:

void remove_list_entry(linked_list* entry) {
    // The "indirect" pointer points to the
    // *address* of the thing we'll update
    linked_list** indirect = &head;

    // Walk the list, looking for the thing that
    // points to the entry we want to remove
    while ((*indirect) != entry)
        indirect = &(*indirect)->next;

    // .. and just remove it
    *indirect = entry->next;
}

I cannot understand the second piece of code. What happens when *indirect = entry->next; is evaluated? I cannot see why it removes the given entry. Could someone explain it, please? Thanks!

What happens when *indirect = entry->next; is evaluated? I cannot see why it removes the given entry.

I hope you have a clear understanding of double pointers 1).

Assume the following node structure:

typedef struct Node {
    int data;
    struct Node *next;
} linked_list;

Assume also that the linked list has 5 nodes and that the entry pointer points to the second node in the list. The in-memory view would be something like this:

                          entry -+
   head                          |
      +---+     +-------+     +-------+     +-------+     +-------+     +--------+
      |   |---->| 1 |   |---->| 2 |   |---->| 3 |   |---->| 4 |   |---->| 5 |NULL|
      +---+     +-------+     +-------+     +-------+     +-------+     +--------+

This statement:

linked_list** indirect = &head;

makes the indirect pointer point to head.

                         entry -+
  head                          |
     +---+     +-------+     +-------+     +-------+     +-------+     +--------+
     |   |---->| 1 |   |---->| 2 |   |---->| 3 |   |---->| 4 |   |---->| 5 |NULL|
     +---+     +-------+     +-------+     +-------+     +-------+     +--------+
       ^
       |
     +---+
     |   |
     +---+
   indirect

The while loop condition is evaluated next:

    while ((*indirect) != entry)

*indirect yields the address of the first node, because head points to the first node. Since entry points to the second node, the loop condition evaluates to true and the following code executes:

indirect = &(*indirect)->next;

This makes the indirect pointer point to the next pointer of the first node. The in-memory view:

                          entry -+
   head                          |
      +---+     +-------+     +-------+     +-------+     +-------+     +--------+
      |   |---->| 1 |   |---->| 2 |   |---->| 3 |   |---->| 4 |   |---->| 5 |NULL|
      +---+     +-------+     +-------+     +-------+     +-------+     +--------+
                      ^
                      |
                    +---+
                    |   |
                    +---+
                  indirect

Now the while loop condition is evaluated again. Because the indirect pointer now points to the next pointer of the first node, *indirect yields the address of the second node. Since entry also points to the second node, the loop condition evaluates to false and the loop exits.
The following code executes now:

*indirect = entry->next;

*indirect dereferences to the next pointer of the first node, which is now assigned the next pointer of the node that entry points to. The in-memory view:

                          entry -+
   head                          |
      +---+     +-------+     +-------+     +-------+     +-------+     +--------+
      |   |---->| 1 |   |--   | 2 |   |---->| 3 |   |---->| 4 |   |---->| 5 |NULL|
      +---+     +-------+  \  +-------+     +-------+     +-------+     +--------+
                  *indirect \              /
                             +------------+

Now the next pointer of the first node points to the third node in the list, and that is how the second node is removed from the list.

I hope this clears up all of your doubts.
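The walkthrough above can be reproduced as a small sketch. The function remove_list_entry and the Node structure are taken from the thread; the helper remove_second_of_five is a hypothetical name introduced here only to build the 5-node list and check the result.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Node {
    int data;
    struct Node *next;
} linked_list;

/* The list head is a file-scope variable, as in the code from the question. */
static linked_list *head = NULL;

void remove_list_entry(linked_list *entry) {
    linked_list **indirect = &head;

    /* Walk the pointers (head, then each next) until one points to entry. */
    while (*indirect != entry)
        indirect = &(*indirect)->next;

    /* Overwrite that pointer so it skips entry. */
    *indirect = entry->next;
}

/* Hypothetical helper: builds 1 -> 2 -> 3 -> 4 -> 5, removes node 2,
   and returns the data of the node that now follows node 1. */
int remove_second_of_five(void) {
    static linked_list nodes[5];

    for (int i = 0; i < 5; i++) {
        nodes[i].data = i + 1;
        nodes[i].next = (i < 4) ? &nodes[i + 1] : NULL;
    }
    head = &nodes[0];

    remove_list_entry(&nodes[1]);

    /* Node 1 now links directly to node 3. */
    assert(nodes[0].next == &nodes[2]);
    return head->next->data;
}
```

The nodes live in a static array here only to keep the sketch free of allocation; the removal logic does not depend on how the nodes were created.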


EDIT :

David suggested in a comment that I add some detail on why the parentheses are required in &(*indirect)->next.

The type of indirect is linked_list **, which means it can hold the address of a pointer of type linked_list *. *indirect yields that pointer of type linked_list *, and ->next yields its next pointer.
But we cannot write *indirect->next, because the -> operator has higher precedence than the unary * operator. So *indirect->next would be interpreted as *(indirect->next), which is invalid: indirect is a pointer to a pointer, not a pointer to a structure, so indirect->next does not compile. Hence we need the parentheses around *indirect.

Also, &(*indirect)->next is interpreted as &((*indirect)->next), which is the address of the next pointer.
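A minimal sketch of the precedence point, assuming the same Node structure as above. The function same_address is a hypothetical name used only for this illustration; it checks that both spellings produce the address of the first node's next member.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Node {
    int data;
    struct Node *next;
} linked_list;

/* Hypothetical helper: returns 1 if both spellings yield the same address. */
int same_address(void) {
    linked_list second = { 2, NULL };
    linked_list first = { 1, &second };
    linked_list *head = &first;
    linked_list **indirect = &head;

    /* -> binds tighter than unary &, so these two are the same expression. */
    linked_list **a = &(*indirect)->next;
    linked_list **b = &((*indirect)->next);

    /* Writing *indirect->next instead would be rejected by the compiler,
       because indirect is a linked_list **, not a pointer to a structure. */

    return a == b && a == &first.next;
}
```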


1) If you don't know how a double pointer works, read on:

Let's take an example:

#include <stdio.h>

int main(void) {
        int a=1, b=2;
        int *p = &a;
        int **pp = &p;

        printf ("1. p : %p\n", (void*)p);
        printf ("1. pp : %p\n", (void*)pp);
        printf ("1. *p : %d\n", *p);
        printf ("1. **pp : %d\n", **pp);

        *pp = &b;  // this changes the address stored in pointer p
        printf ("2. p : %p\n", (void*)p);
        printf ("2. pp : %p\n", (void*)pp);
        printf ("2. *p : %d\n", *p);
        printf ("2. **pp : %d\n", **pp);

        return 0;
}

In the above code, the statement *pp = &b; shows that we can change the address that pointer p holds without accessing p directly, by going through the double pointer pp, which points to p: dereferencing pp yields p.

Its output (the exact addresses will vary):

1. p : 0x7ffeedf75a38
1. pp : 0x7ffeedf75a28
1. *p : 1
1. **pp : 1
2. p : 0x7ffeedf75a34   <=========== changed 
2. pp : 0x7ffeedf75a28
2. *p : 2
2. **pp : 2

In-memory view would be something like this:

// In the picture below:
// 100 represents the address 0x7ffeedf75a38
// 200 represents the address 0x7ffeedf75a34
// 300 represents the address 0x7ffeedf75a28

int *p = &a;
      p         a
      +---+     +---+
      |100|---->| 1 |
      +---+     +---+

int **pp = &p;

      pp        p         a
      +---+     +---+     +---+
      |300|---->|100|---->| 1 |
      +---+     +---+     +---+


*pp = &b;

      pp        p         b
      +---+     +---+     +---+
      |300|---->|200|---->| 2 |
      +---+     +---+     +---+
                ^^^^^     ^^^^^

The entry isn't really "deleted", it's just no longer in the list. If this is your chain:

A --> B --> C --> D --> E --> ■

And you want to delete C, you're really just linking over it. It's still there in memory, but no longer accessible from your data structure.

            C 
A --> B --------> D --> E --> ■

That last line sets the next pointer of B to D instead of C.
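Because the entry is only unlinked, releasing its memory is a separate step. The following is a minimal sketch of that, assuming the nodes were allocated with malloc; push_front and unlink_then_free are hypothetical helpers that do not appear in the original code.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Node {
    int data;
    struct Node *next;
} linked_list;

static linked_list *head = NULL;

void remove_list_entry(linked_list *entry) {
    linked_list **indirect = &head;
    while (*indirect != entry)
        indirect = &(*indirect)->next;
    *indirect = entry->next;
}

/* Hypothetical helper: allocates a node and links it at the front. */
linked_list *push_front(int data) {
    linked_list *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->data = data;
    n->next = head;
    head = n;
    return n;
}

/* Hypothetical helper: unlinks the middle node of 1 -> 2 -> 3, reads it,
   then frees it. Returns the value read, or -1 if allocation failed. */
int unlink_then_free(void) {
    push_front(3);
    linked_list *c = push_front(2);
    push_front(1);
    if (c == NULL)
        return -1;

    remove_list_entry(c);          /* c is no longer reachable from head */
    int still_readable = c->data;  /* but the node itself still exists */
    free(c);                       /* releasing it is the caller's job */

    /* Free the nodes that remain in the list. */
    while (head != NULL) {
        linked_list *next = head->next;
        free(head);
        head = next;
    }
    return still_readable;
}
```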

Instead of looping through the entries in the list, as the first example does, the second example loops through the pointers to the entries in the list. That allows the second example to have the simple conclusion with the statement you've asked about, which in English is "set the pointer that used to point to the entry I want to remove from the list so that it now points to whatever that entry was pointing to". In other words, it makes the pointer that was pointing to the entry you're removing point past the entry you're removing.

The first example has to have a special way to handle the unique case of the entry you want to remove being the first entry in the list. Because the second example loops through the pointers (starting with &head), it doesn't have a special case.
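To illustrate the absence of a special case, here is a small sketch in which the same function removes first the head node and then the tail node. remove_head_and_tail is a hypothetical helper name; the rest follows the code from the question.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Node {
    int data;
    struct Node *next;
} linked_list;

static linked_list *head = NULL;

void remove_list_entry(linked_list *entry) {
    linked_list **indirect = &head;
    while (*indirect != entry)
        indirect = &(*indirect)->next;
    *indirect = entry->next;
}

/* Hypothetical helper: builds 1 -> 2 -> 3, removes the first and the last
   node, and returns the data of the only node left. */
int remove_head_and_tail(void) {
    static linked_list n[3];

    for (int i = 0; i < 3; i++) {
        n[i].data = i + 1;
        n[i].next = (i < 2) ? &n[i + 1] : NULL;
    }
    head = &n[0];

    /* Head case: the loop body never runs, so *indirect is head itself. */
    remove_list_entry(&n[0]);
    assert(head == &n[1]);

    /* Tail case: *indirect is n[1].next, which becomes NULL. */
    remove_list_entry(&n[2]);
    assert(n[1].next == NULL);

    return head->data;
}
```

Both calls end in the same single assignment; only the pointer that indirect refers to differs.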

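Another way to express the removal is to stop the loop one node before entry and then update that node's next member:

while ((*indirect)->next != entry)
    indirect = &(*indirect)->next;

(*indirect)->next = entry->next;

Note that this variant only works when entry is not the first node, because head itself is never compared with entry. The original *indirect = entry->next; does not merely step to the next node; it overwrites whichever pointer refers to entry, and that covers the first node as well.

I hope that helps.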

This will be much easier to understand if you rewrite indirect = &(*indirect)->next; as indirect = &((*indirect)->next);

The while loop gives us the address of the next pointer that points to entry, that is, the next pointer belonging to the node just before entry. The last statement then changes the value of that next pointer so that it no longer points to entry. In the special case where entry is the first node, the while loop body is skipped and the last line changes the value of the head pointer so that it points to the node after entry.

This example is not only a great way of manipulating linked list structures in particular, but also a really excellent way of demonstrating the power of pointers in general.

When you delete an element from a singly-linked list, you have to make the previous node point to the next node, bypassing the node you're deleting. For example, if you're deleting node E , then whatever list pointer it is that used to point to E , you have to make it point to whatever E.next points to.

Now, the problem is that there are two possibilities for "whatever list pointer it is that used to point to E ". Much of the time, it's some previous node's next pointer that points to E . But if E happens to be the first node in the list, that means there's no previous node in the list, and it's the top-level list pointer that points to E — in Linus's example, that's the variable head .

So in Linus's first, "normal" example, there's an if statement. If there's a previous node, the code sets prev->next to point to the next node. But if there's no previous node, that means it's deleting the node at the head of the list, so it sets head to point to the next node.

And although that's not the end of the world, it's two separate assignments and an if condition to take care of what we thought of in English as "whatever list pointer it is that used to point to E ". And one of the crucial hallmarks of a good programmer is an unerring sense for sniffing out needless redundancy like this and replacing it with something cleaner.

In this case, the key insight is that the two things we might want to update, namely head or prev->next , are both pointers to a list node, or linked_list * . And one of the things that pointers are great at is pointing at a thing we care about, even if that thing might be, depending on circumstances, one of a couple of different things.

Since the thing we care about is a pointer to a linked_list , a pointer to the thing we care about will be a pointer to a pointer to a linked_list , or linked_list ** .

And that's exactly what the variable indirect is in Linus's "better" example. It is, literally, a pointer to "whatever list pointer it is that used to point to E " (or, in the actual code, not E , but the passed-in entry being deleted). At first, the indirect pointer points to head , but later, after we've begun walking through the list to find the node to delete, it points at the next pointer of the node (the previous node) that points at the one we're looking at. So, in any case, *indirect (that is, the pointer pointed to by indirect ) is the pointer we want to update. And that's precisely what the magic line

*indirect = entry->next;

does in the "better" example.

The other thing to notice (although this probably makes the code even more cryptic at first) is that the indirect variable also takes the place of the walk variable used in the first example. That is, everywhere the first example used walk , the "better" example uses *indirect . But that makes sense: we need to walk over all the nodes in the list, looking for entry . So we need a pointer to step over those nodes — that's what the walk variable did in the first example. But when we find the entry we want to delete, the pointer to that entry will be "whatever list pointer it is that used to point to E " — and it will be the pointer to update. In the first example, we couldn't set walk to prev->next — that would just update the local walk variable, not head or one of the next pointers in the list. But by using the pointer indirect to (indirectly) walk the list, it's always the case that *indirect — that is, the pointer pointed to by indirect — is the original pointer to the node we're looking at (not a copy sitting in walk ), meaning it's something we can usefully update by saying *indirect = entry->next .
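One consequence of walking with indirect is that the loop in the thread's code assumes entry is actually in the list; if it is not, the walk runs off the end. As a hedged sketch only, the variant below adds an end-of-list check and takes the head by address instead of using a global. The name remove_list_entry_checked, its parameters, and its return value are assumptions made for this illustration and are not part of Linus's example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct Node {
    int data;
    struct Node *next;
} linked_list;

/* Hypothetical variant: list is the address of the head pointer.
   Returns true if entry was found and unlinked, false otherwise. */
bool remove_list_entry_checked(linked_list **list, linked_list *entry) {
    linked_list **indirect = list;

    /* Stop either at the pointer that refers to entry or at the end. */
    while (*indirect != NULL && *indirect != entry)
        indirect = &(*indirect)->next;

    if (*indirect == NULL)
        return false;  /* entry is not in this list */

    *indirect = entry->next;
    return true;
}
```

A caller would pass the address of its own head pointer, for example remove_list_entry_checked(&my_list, node), where my_list and node are the caller's variables.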
