How to avoid a deadlock caused by a thread panic?

My server uses a Barrier to notify the client when it's safe to attempt to connect. Without the barrier, we risk failing randomly as there is no guarantee that the server socket would have been bound.

Now imagine that the server panics, for instance because it tries to bind the socket to port 80. The client will be left wait()-ing forever. We cannot join() the server thread to find out whether it panicked, because join() is a blocking operation: if we join(), we won't be able to connect().

What's the proper way to do this kind of synchronization, given that the std::sync APIs do not provide methods with timeouts?

This is just an MCVE to demonstrate the issue. I had a similar case in a unit test, which was left running forever.

use std::{
    io::prelude::*,
    net::{SocketAddr, TcpListener, TcpStream},
    sync::{Arc, Barrier},
};

fn main() {
    let port = 9090;
    //let port = 80;

    let barrier = Arc::new(Barrier::new(2));
    let server_barrier = barrier.clone();

    let client_sync = move || {
        barrier.wait();
    };

    let server_sync = Box::new(move || {
        server_barrier.wait();
    });

    server(server_sync, port);
    //server(Box::new(|| { no_sync() }), port); //use to test without synchronisation

    client(&client_sync, port);
    //client(&no_sync, port); //use to test without synchronisation
}

fn no_sync() {
    // do nothing in order to demonstrate the need for synchronization
}

fn server(sync: Box<dyn Fn() + Send + Sync>, port: u16) {
    std::thread::spawn(move || {
        // There is no guarantee when the OS will schedule the thread; sleeping makes the race 100% reproducible.
        std::thread::sleep(std::time::Duration::from_millis(100));
        let addr = SocketAddr::from(([127, 0, 0, 1], port));
        let socket = TcpListener::bind(&addr).unwrap();
        println!("server socket bound");
        sync();

        let (mut client, _) = socket.accept().unwrap();

        client.write_all(b"hello mcve").unwrap();
    });
}

fn client(sync: &dyn Fn(), port: u16) {
    sync();

    let addr = SocketAddr::from(([127, 0, 0, 1], port));
    let mut socket = TcpStream::connect(&addr).unwrap();
    println!("client socket connected");

    let mut buf = String::new();
    socket.read_to_string(&mut buf).unwrap();
    println!("client received: {}", buf);
}

Instead of a Barrier I would use a Condvar here.

To actually solve your problem, I see at least three possible solutions:

  1. Use Condvar::wait_timeout and set the timeout to a reasonable duration (e.g. 1 second, which should be enough for binding to a port).
  2. Use the same method, but with a shorter timeout (e.g. 10 ms), and check after each wait whether the Mutex is poisoned. Note that a Mutex becomes poisoned only if a thread panics while holding its lock.
  3. Instead of a Condvar, use a plain Mutex (make sure the other thread has locked it first) and then call Mutex::try_lock to check whether the Mutex is poisoned.

I think one should prefer solution 1 or 2 over the third, because they avoid the need to ensure that the other thread has locked the Mutex first.
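
As a rough illustration of solution 1, here is a minimal sketch that replaces the Barrier with a Mutex<bool> plus Condvar pair and waits with a timeout. The names Startup, Signal, spawn_server and wait_for_server are hypothetical helpers invented for this example, and the fail flag merely simulates a panicking bind rather than opening a real socket. It uses Condvar::wait_timeout_while, a variant of wait_timeout that also handles spurious wakeups.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;
use std::time::Duration;

/// Outcome of waiting for the server thread (hypothetical helper type).
#[derive(Debug, PartialEq)]
enum Startup {
    Ready,
    TimedOut,
    Poisoned,
}

/// Shared "server is ready" flag plus the Condvar used to signal it.
type Signal = Arc<(Mutex<bool>, Condvar)>;

/// Stand-in for the server thread; `fail` simulates a panic during bind.
fn spawn_server(fail: bool) -> Signal {
    let signal: Signal = Arc::new((Mutex::new(false), Condvar::new()));
    let server_signal = Arc::clone(&signal);
    thread::spawn(move || {
        if fail {
            // The panic happens before the lock is taken, so the Mutex is
            // not poisoned and the client can only notice via the timeout.
            panic!("simulated bind failure");
        }
        let (lock, cvar) = &*server_signal;
        *lock.lock().unwrap() = true; // the socket would be bound here
        cvar.notify_one();
    });
    signal
}

/// Waits until the server reports readiness or `timeout` elapses.
fn wait_for_server(signal: &Signal, timeout: Duration) -> Startup {
    let (lock, cvar) = &**signal;
    let guard = match lock.lock() {
        Ok(guard) => guard,
        // Only reachable if some thread panicked while holding the lock.
        Err(_) => return Startup::Poisoned,
    };
    let outcome = match cvar.wait_timeout_while(guard, timeout, |ready| !*ready) {
        Ok((_, result)) if result.timed_out() => Startup::TimedOut,
        Ok(_) => Startup::Ready,
        Err(_) => Startup::Poisoned,
    };
    outcome
}

fn main() {
    let timeout = Duration::from_millis(500);
    println!("healthy server: {:?}", wait_for_server(&spawn_server(false), timeout));
    println!("panicking server: {:?}", wait_for_server(&spawn_server(true), timeout));
}
```

In the client from the question, a TimedOut or Poisoned result would be the point at which to give up instead of calling connect(). The 500 ms timeout is an arbitrary choice for the sketch.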
