
How to move the lifetime of references outside a scope in Rust

I am trying to implement the following functionality in Rust.

I want to have a structure Node which holds a vector of references to other Node structures. In addition, I have a master vector which keeps all the Node structures that have been instantiated.

The key point is that the Nodes are allocated within a loop (i.e. in their own scope), while the master vector that keeps all the structures (or references to them) is declared outside the loop. In my opinion this is a run-of-the-mill use case.

After a lot of trying I came up with the code below, which still does not compile. I tried it with plain &Node and alternatively with RefCell<&Node>; neither compiles.

struct Node<'a> {
    name: String,
    nodes: RefCell<Vec<&'a Node<'a>>>,
}

impl<'a> Node<'a> {

    fn create(name: String) -> Node<'a> {
        Node {
            name: name,
            nodes: RefCell::new(Vec::new()),
        }
    }
    
    fn add(&self, value: &'a Node<'a>) {
        self.nodes.borrow_mut().push(value);
    }

    fn get_nodes(&self) -> Vec<&'a Node> {
        self.nodes.take()
    }
}


// Later the code ...

    let mut the_nodes_ref: HashMap<String, RefCell<&Node>> = HashMap::new();
    let mut the_nodes_nodes: HashMap<String, &Node> = HashMap::new();

    // This works
    let no1_out = Node::create(String::from("no1"));
    let no2_out = Node::create(String::from("no2"));

    no1_out.add(&no2_out);
    no2_out.add(&no1_out);

    the_nodes_nodes.insert(no1_out.name.clone(), &no1_out);
    the_nodes_nodes.insert(no2_out.name.clone(), &no2_out);

    let no1_ref_out = RefCell::new(&no1_out);
    let no2_ref_out = RefCell::new(&no2_out);

    the_nodes_ref.insert(no1_out.name.clone(), no1_ref_out);
    the_nodes_ref.insert(no2_out.name.clone(), no2_ref_out);

    // This does not work because no1 and no2 do not live long enough
    let items = [1, 2, 3];
    for _ in items {
        let no1 = Node::create(String::from("no1"));
        let no2 = Node::create(String::from("no2"));

        no1.add(&no2); // <- Error: no2 does not live long enough
        no2.add(&no1); // <- Error: no1 does not live long enough

        the_nodes_nodes.insert(no1.name.clone(), &no1);
        the_nodes_nodes.insert(no2.name.clone(), &no2);

        let no1_ref = RefCell::new(&no1);
        let no2_ref = RefCell::new(&no2);

        the_nodes_ref.insert(no1.name.clone(), no1_ref);
        the_nodes_ref.insert(no2.name.clone(), no2_ref);
    }

I more or less understand the problem, but I am wondering how to solve it. How can I allocate a structure within a separate scope (here the for loop) and then use the allocated structures outside the for loop? Allocating a structure within a loop and using it later outside of the loop seems like a common use case.

I have the feeling that the missing link is to tell the Rust compiler, via the lifetime parameters, that the references should also stay alive outside the for loop, but I have no idea how to do that. Maybe this is not the correct way to do it at all.

Another key point is that I want the Nodes to hold references to the other Nodes, not copies of them. The same is true for the master vector: it should hold references to the allocated Nodes, not copies.

All this boils down to the answer to a single question: which entity in the program should own the Node values?

Right now main() owns the values. You can tell because everything else in the program only holds a &Node, which is a reference to something owned by something else. This is why the loop variant fails: no1 and no2 are the owned values, but they are destroyed at the end of each loop iteration, so the maps would be left holding dangling references.
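To make the ownership point concrete, here is a minimal sketch. The Item type and the build_items function are hypothetical names used only for illustration, not part of the question's code. It shows that values created inside a loop survive the loop if they are moved into a collection declared outside it, whereas storing references to the loop-local variables would not compile.

```rust
// Hypothetical stand-in for the question's Node type, without the links.
#[derive(Debug)]
struct Item {
    name: String,
}

// Creates values inside a loop and moves each one into a Vec that
// outlives the loop, so the Vec (not the loop body) owns them.
fn build_items(count: usize) -> Vec<Item> {
    let mut owned = Vec::new();
    for i in 0..count {
        let item = Item { name: format!("no{}", i + 1) };
        // Storing `&item` in a `Vec<&Item>` instead would be rejected:
        // `item` is dropped at the end of this iteration, so the
        // reference would dangle.
        owned.push(item); // moving transfers ownership out of the loop scope
    }
    owned
}

fn main() {
    let items = build_items(3);
    // References handed out after the loop borrow from `items`,
    // which is still alive here.
    let names: Vec<&str> = items.iter().map(|i| i.name.as_str()).collect();
    println!("{:?}", names);
}
```

This only addresses who owns the values; it does not yet let the items refer to each other, which is what the rest of the answer deals with.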

One way to solve this problem is to have a collection own the values. However, due to Rust's borrowing rules, you will not be able to modify the collection once you start giving out references, because that requires borrowing the collection mutably. So you'd have to create all your nodes up front, put them in the collection, and then start giving references out to the other nodes. This is the most efficient way to solve the problem, but is inflexible and binds the lifetime of all the nodes together. In real code, nodes may come and go so having them share a lifetime is impractical.

The classic solution to this problem is shared ownership via Rc, but that comes with its own set of problems when nodes reference each other. In that case, you might leak node objects even if you drop them from the global collection, because the nodes still hold strong references to each other and their reference counts never reach zero.
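The leak can be observed directly. The following sketch uses a hypothetical StrongNode type that links to its neighbours with Rc (strong) references and counts how many nodes actually get destroyed. The type, the counter field, and the dropped_count helper are assumptions made for this illustration.

```rust
use std::cell::{Cell, RefCell};
use std::rc::Rc;

// Hypothetical node that holds strong (Rc) links and records its own drop.
struct StrongNode {
    links: RefCell<Vec<Rc<StrongNode>>>,
    drops: Rc<Cell<usize>>,
}

impl Drop for StrongNode {
    fn drop(&mut self) {
        // Count every node that is really destroyed.
        self.drops.set(self.drops.get() + 1);
    }
}

// Builds two nodes, optionally links them into a cycle, lets both handles
// go out of scope, and returns how many nodes were actually destroyed.
fn dropped_count(make_cycle: bool) -> usize {
    let drops = Rc::new(Cell::new(0));
    {
        let a = Rc::new(StrongNode {
            links: RefCell::new(Vec::new()),
            drops: Rc::clone(&drops),
        });
        let b = Rc::new(StrongNode {
            links: RefCell::new(Vec::new()),
            drops: Rc::clone(&drops),
        });
        a.links.borrow_mut().push(Rc::clone(&b)); // a -> b
        if make_cycle {
            b.links.borrow_mut().push(Rc::clone(&a)); // b -> a closes the cycle
        }
    } // `a` and `b` go out of scope here
    drops.get()
}

fn main() {
    println!("without cycle: {} dropped", dropped_count(false));
    println!("with cycle:    {} dropped", dropped_count(true));
}
```

Without the cycle both nodes are destroyed. With the cycle neither is, because each node keeps the other's strong count above zero.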

This is where weak references come in: a Weak lets you refer to a value managed by an Rc without keeping that value alive. However, a value in an Rc can't be mutated through the Rc once other Rc or Weak references to the same value exist, so adding the weak references to the nodes requires interior mutability via RefCell.
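Both building blocks can be tried in isolation before combining them. In this sketch the function names weak_liveness and push_through_shared are made up for the example; Rc::downgrade, Weak::upgrade, and RefCell::borrow_mut are the standard library calls being demonstrated.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

// Returns (alive_before_drop, alive_after_drop) for a Weak pointing at a
// value whose only strong reference is dropped in between.
fn weak_liveness() -> (bool, bool) {
    let strong: Rc<String> = Rc::new(String::from("no1"));
    let weak: Weak<String> = Rc::downgrade(&strong);

    // While a strong reference exists, upgrade() yields Some(Rc<T>).
    let before = weak.upgrade().is_some();

    drop(strong); // last strong reference gone: the String is destroyed

    // The Weak did not keep the value alive, so upgrade() now yields None.
    let after = weak.upgrade().is_some();
    (before, after)
}

// Mutates a Vec that is shared by two Rc handles, using RefCell for
// interior mutability, and returns the resulting length.
fn push_through_shared() -> usize {
    let shared: Rc<RefCell<Vec<i32>>> = Rc::new(RefCell::new(Vec::new()));
    let other = Rc::clone(&shared);

    // Neither handle is declared `mut`; RefCell checks the borrows at runtime.
    other.borrow_mut().push(1);
    shared.borrow_mut().push(2);

    // Bind the length first so the temporary borrow guard is released
    // before `shared` goes out of scope.
    let len = shared.borrow().len();
    len
}

fn main() {
    println!("{:?}", weak_liveness());
    println!("{}", push_through_shared());
}
```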

Let's put all this together:

use std::collections::HashMap;
use std::rc::{Rc, Weak};
use std::cell::RefCell;

struct Node {
    name: String,
    nodes: RefCell<Vec<Weak<Node>>>,
}

impl Node {
    fn new(name: String) -> Self {
        Node { name, nodes: RefCell::new(Vec::new()) }
    }
    
    fn name(&self) -> &String {
        &self.name
    }
    
    fn add(&self, value: Weak<Node>) {
        self.nodes.borrow_mut().push(value);
    }

    fn get_nodes(&self) -> Vec<Rc<Node>> {
        // Return strong references.  While we are doing this, clean out
        // any dead weak references.
        let mut strong_nodes = Vec::new();
        
        self.nodes.borrow_mut().retain(|w| match w.upgrade() {
            Some(v) => {
                strong_nodes.push(v);
                true
            },
            None => false,
        });
        
        strong_nodes
    }
}

fn main() {
    let mut the_nodes_nodes: HashMap<String, Rc<Node>> = HashMap::new();

    let items = [1, 2, 3];
    for _ in items {
        let no1 = Rc::new(Node::new(String::from("no1")));
        let no2 = Rc::new(Node::new(String::from("no2")));
        
        // downgrade creates a new Weak<T> for an Rc<T>
        no1.add(Rc::downgrade(&no2));
        no2.add(Rc::downgrade(&no1));
        
        for n in [no1, no2] {
            the_nodes_nodes.insert(n.name().clone(), n);
        }
    }
}

The nodes are strongly referenced by the_nodes_nodes, which keeps them alive, but we can hand out further Rc or Weak instances that refer to the same node without having to manage lifetimes nearly as strictly.

Note that when a Node is destroyed because it has been removed from the map, existing Weak references to that node are no longer valid. You must invoke upgrade() on a Weak reference, which gives you back an Rc only if the Node value is still alive. The get_nodes() method wraps up this logic by returning a Vec of strong references (Rc<Node>) to only those nodes that are still alive.
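As a sketch of that behaviour, the following example removes one node from the owning map and checks what get_nodes() returns before and after. The Node type is repeated in condensed form (with new taking a &str for brevity) so the block compiles on its own; the helper neighbours_before_and_after_removal is a hypothetical name introduced for this illustration.

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::rc::{Rc, Weak};

// Same shape as the Node type above, repeated so the example is self-contained.
struct Node {
    name: String,
    nodes: RefCell<Vec<Weak<Node>>>,
}

impl Node {
    fn new(name: &str) -> Self {
        Node { name: name.to_string(), nodes: RefCell::new(Vec::new()) }
    }

    fn add(&self, value: Weak<Node>) {
        self.nodes.borrow_mut().push(value);
    }

    // Upgrade live links to Rc and prune the dead ones.
    fn get_nodes(&self) -> Vec<Rc<Node>> {
        let mut strong_nodes = Vec::new();
        self.nodes.borrow_mut().retain(|w| match w.upgrade() {
            Some(v) => {
                strong_nodes.push(v);
                true
            }
            None => false,
        });
        strong_nodes
    }
}

// Hypothetical helper: names of no1's live neighbours before and after
// no2 is removed from the owning map.
fn neighbours_before_and_after_removal() -> (Vec<String>, Vec<String>) {
    let mut map: HashMap<String, Rc<Node>> = HashMap::new();
    let no1 = Rc::new(Node::new("no1"));
    let no2 = Rc::new(Node::new("no2"));
    no1.add(Rc::downgrade(&no2));
    no2.add(Rc::downgrade(&no1));
    map.insert(no1.name.clone(), Rc::clone(&no1));
    map.insert(no2.name.clone(), no2); // the map now holds the only strong ref to no2

    let before: Vec<String> = no1.get_nodes().iter().map(|n| n.name.clone()).collect();
    map.remove("no2"); // drops the last Rc to no2, so that node is destroyed
    let after: Vec<String> = no1.get_nodes().iter().map(|n| n.name.clone()).collect();
    (before, after)
}

fn main() {
    let (before, after) = neighbours_before_and_after_removal();
    println!("before removal: {:?}", before);
    println!("after removal:  {:?}", after);
}
```

Note that the Rc values returned by get_nodes() are themselves strong references, so holding on to that Vec keeps the listed nodes alive until it is dropped.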


For the sake of completeness, here is what the non-Rc option would look like. A helper struct, Nodes, holds the map.

use std::collections::HashMap;
use std::cell::RefCell;

struct Node<'a> {
    name: String,
    nodes: RefCell<Vec<&'a Node<'a>>>,
}

impl<'a> Node<'a> {
    fn new(name: String) -> Self {
        Node { name, nodes: RefCell::new(Vec::new()) }
    }
    
    fn name(&self) -> &String {
        &self.name
    }
    
    fn add(&self, value: &'a Node<'a>) {
        self.nodes.borrow_mut().push(value);
    }
    
    fn get_nodes(&self) -> Vec<&'a Node<'a>> {
        self.nodes.borrow().clone()
    }
}

struct Nodes<'a> {
    nodes: HashMap<String, Node<'a>>,
}

impl<'a> Nodes<'a> {
    fn new<T: IntoIterator<Item=String>>(node_names: T) -> Self {
        let mut nodes = HashMap::new();
        
        for name in node_names {
            nodes.insert(name.clone(), Node::new(name));
        }
        
        Self { nodes }
    }
    
    fn get_node(&'a self, name: &String) -> Option<&'a Node<'a>> {
        self.nodes.get(name)
    }
}

fn main() {
    let nodes = Nodes::new(["n1".to_string(), "n2".to_string()]);
    
    let n1 = nodes.get_node(&"n1".to_string()).expect("n1");
    let n2 = nodes.get_node(&"n2".to_string()).expect("n2");
    
    n1.add(n2);
    n2.add(n1);
}

Note that we have to create all nodes in advance. Creating a node requires borrowing the HashMap mutably, which we can't do while there is a reference to a value in the map. The Nodes type makes this clear by requiring an iterator of node names to create in its constructor function; adding new nodes later isn't permitted by the API.

We cannot obtain a mutable reference to a node while we hold a reference to any other node, so this approach also requires interior mutability (RefCell) for each node's node list and simply doesn't provide an API for obtaining a mutable reference to a node.

