CompletableFuture injection from the inside

Is it possible to inject a CompletableFuture into a CompletableFuture chain from the inside?

I'm working with a function like this:

public CompletableFuture<Boolean> getFutureOfMyLongRunningTask() {
    CompletableFuture<Boolean> future = CompletableFuture.supplyAsync(() -> {
        // ... Some processing here ...
        if (somecondition failed)
            return false; // Task failed!

        return true; // OK
    }).thenApplyAsync((Boolean result) -> {
        if (!result) // check of previous stage fail
            return false;

        // ... Some processing here ...

        if (!some condition satisfied) {
            // This is where I want the injection to happen. 
            // This stage should be suspended and a new stage should be injected between this point and the next stage.
        }

        return true; // OK
    }).thenApplyAsync((Boolean result) -> {
        if (!result) // check of previous stage fail
            return false;

        // ... Some processing here ...

        return true; // OK
    });

    // This is the result we have to wait for.
    return future;
}

At the injection point, if (!some condition satisfied), I'd like to run a query that takes (say) 5 seconds to execute and retrieves some data needed by the final stage. I don't want to block the thread for 5 seconds by making the query synchronous inside the if; I want it to run asynchronously and, when the results are back, go straight to the next stage. The problem is that the condition is known only inside the chain.

Anyone have ideas on this?


EDIT

I'll try to clarify the question. I originally had a single piece of code. Now I'm trying to optimize it so that I spawn fewer threads.

The point is that at the injection point I want to issue something like (sorry, Datastax Java driver for Cassandra code snippet):

ResultSetFuture rsFuture = session.executeAsync(query);

and inject that future into the chain. This would leave the calling thread free to do other things instead of sitting and waiting for the results.
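As a rough illustration of what "injecting that future" involves, a callback-style driver future first has to be adapted to a CompletableFuture. The sketch below assumes a hypothetical CallbackFuture interface with a whenDone method. It is a stand-in, not the Datastax API, and the real driver's future would need its own listener mechanism.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.BiConsumer;

public class FutureBridgeSketch {
    // Hypothetical stand-in for a driver future that notifies a listener when done.
    // This is NOT the Datastax API; it only models "call me back with a value or an error".
    interface CallbackFuture<T> {
        void whenDone(BiConsumer<T, Throwable> listener);
    }

    // Adapts the callback-style future to a CompletableFuture without blocking any thread.
    static <T> CompletableFuture<T> toCompletableFuture(CallbackFuture<T> source) {
        CompletableFuture<T> target = new CompletableFuture<>();
        source.whenDone((value, error) -> {
            if (error != null) {
                target.completeExceptionally(error);
            } else {
                target.complete(value);
            }
        });
        return target;
    }

    static String demo() {
        // Fake "driver" that delivers its result from another thread.
        CallbackFuture<String> fake =
                listener -> new Thread(() -> listener.accept("row", null)).start();
        return toCompletableFuture(fake)
                .thenApply(row -> row + " processed")
                .join();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "row processed"
    }
}
```

The adapted future can then be returned from a stage of the chain like any other CompletableFuture.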


I don't know if I can make it more clear than this, but let's follow this example.

I run a loop in the main thread:

for (int i = 0; i < 1000; i++) {
    getFutureOfMyLongRunningTask(i);
}

This loop lives on the main thread only, but each call to the function enqueues a new task in a thread pool P. Assume now that P is a fixed thread pool of size 1. That means only one thread exists in P, and it can process only one task at a time. The main loop, however, will enqueue all 1000 tasks and will then need to wait for all of them to finish.

Now suppose that the 1st task out of 1000 needs to perform a long DB query. We now have two choices:

  1. The query is executed synchronously inside the processing thread (belonging to the thread pool P). That means I simply issue the query inside the if (!some condition satisfied) block and wait for the results. This effectively blocks all task processing, because the thread pool P has no free threads: its only thread is blocked on I/O.

  2. The query is executed asynchronously inside the processing thread (belonging to the thread pool P). That means I issue the query inside the if (!some condition satisfied) block and immediately get back a future that I will listen to (the DB driver will probably spawn another thread and block that thread waiting for the results). However, the thread belonging to P is now free to process at least one other task.

In my opinion, option 2 is better than option 1, and the same reasoning applies to thread pools of size > 1 or of dynamic size.

All I want is to keep the thread pool as free as possible and to spawn as few threads as possible, in order to avoid wasting resources.

Hope that makes sense. If not, please could you explain where I'm wrong?

Instead of using thenApplyAsync, use thenCompose or thenComposeAsync, which lets the function return a CompletableFuture<Foo> instead of a Foo. And instead of returning true when the condition is satisfied, you'll need to return CompletableFuture.completedFuture(true).

public CompletableFuture<Boolean> getFutureOfMyLongRunningTask() {
    CompletableFuture<Boolean> future = CompletableFuture.supplyAsync(() -> {
        // ... Some processing here ...
        if (somecondition failed)
            return false; // Task failed!

        return true; // OK
    }).thenComposeAsync((Boolean result) -> {
        if (!result) // check of previous stage fail
            return CompletableFuture.completedFuture(false);

        // ... Some processing here ...

        if (!some condition satisfied) {
            return runSomeOtherQuery();
        }

        return CompletableFuture.completedFuture(true); // OK
    }).thenApplyAsync((Boolean result) -> {
        if (!result) // check of previous stage fail
            return false;

        // ... Some processing here ...

        return true; // OK
    });

    // This is the result we have to wait for.
    return future;
}


public CompletableFuture<Boolean> runSomeOtherQuery() {
    ....
}
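To see the effect on a pool of size 1, here is a minimal runnable sketch of the same idea. It rests on assumptions: runSomeOtherQuery is simulated with a timer that completes a future after 300 ms (a hypothetical stand-in for a real asynchronous driver call), and secondStage, completionOrder and the class name are made up for illustration. Only task 0 issues the query.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class ComposeOnSingleThreadPool {
    // Hypothetical stand-in for an async driver call: the future is completed
    // by the "driver" thread after 300 ms, and the caller is never blocked.
    static CompletableFuture<Boolean> runSomeOtherQuery(ScheduledExecutorService driver) {
        CompletableFuture<Boolean> result = new CompletableFuture<>();
        driver.schedule(() -> { result.complete(true); }, 300, TimeUnit.MILLISECONDS);
        return result;
    }

    // Second stage: returns a future, so it can "inject" the query when needed.
    static CompletableFuture<Boolean> secondStage(boolean previousOk, boolean needsQuery,
                                                  ScheduledExecutorService driver) {
        if (!previousOk) {
            return CompletableFuture.completedFuture(false);
        }
        if (needsQuery) {
            return runSomeOtherQuery(driver);
        }
        return CompletableFuture.completedFuture(true);
    }

    // Runs 5 tasks on a pool of size 1 and records the order in which they finish.
    static List<Integer> completionOrder() {
        ExecutorService pool = Executors.newFixedThreadPool(1); // the pool P
        ScheduledExecutorService driver = Executors.newSingleThreadScheduledExecutor();
        List<Integer> order = new CopyOnWriteArrayList<>();
        try {
            List<CompletableFuture<Void>> tasks = new ArrayList<>();
            for (int i = 0; i < 5; i++) {
                final int id = i;
                tasks.add(CompletableFuture
                        .supplyAsync(() -> true, pool)
                        .thenComposeAsync(ok -> secondStage(ok, id == 0, driver), pool)
                        .thenAcceptAsync(ok -> order.add(id), pool));
            }
            CompletableFuture.allOf(tasks.toArray(new CompletableFuture[0])).join();
            return order;
        } finally {
            pool.shutdown();
            driver.shutdown();
        }
    }

    public static void main(String[] args) {
        // Task 0 is listed last: the other tasks ran while its query was pending.
        System.out.println(completionOrder());
    }
}
```

Because the second stage hands back a future rather than waiting, the single thread in P moves on to the remaining tasks, and task 0 resumes only once the simulated query completes.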

It seems you think that splitting work between chained stages (with “async” involved) somehow magically adds a concurrency improvement to your program logic.

When you chain stages, you create a direct, sequential dependency, even if you use one of the “async” methods, because a dependent stage doesn't start before the previous one has completed. So this kind of chaining adds the possibility of expensive thread hopping, i.e. a different thread executing the next stage, without raising concurrency, as there will still be at most one thread processing one of your stages. In fact, the still-possible scenario in which the same thread happens to execute all stages is likely to be the fastest execution.
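The sequential dependency can be checked with a small experiment. This is only a sketch: the class name, the stage helper and the 20 ms sleep are illustrative assumptions. Each stage counts how many stages are running at the same moment, and even with a pool of four threads the maximum stays at one.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class SequentialStagesSketch {
    static final AtomicInteger active = new AtomicInteger();
    static final AtomicInteger maxActive = new AtomicInteger();

    // Simulated stage: tracks how many stages are running concurrently.
    static boolean stage(boolean previousOk) {
        int now = active.incrementAndGet();
        maxActive.accumulateAndGet(now, Math::max);
        try {
            Thread.sleep(20); // pretend to do some work
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        active.decrementAndGet();
        return previousOk;
    }

    // Runs a three-stage async chain on a four-thread pool and reports
    // the highest number of stages that were ever active at once.
    static int maxConcurrentStages() {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            CompletableFuture.supplyAsync(() -> stage(true), pool)
                    .thenApplyAsync(SequentialStagesSketch::stage, pool)
                    .thenApplyAsync(SequentialStagesSketch::stage, pool)
                    .join();
            return maxActive.get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(maxConcurrentStages()); // prints 1
    }
}
```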

There is a much simpler, more natural way of expressing dependencies: just write the actions one after another in one piece of code. You can still schedule that code block for asynchronous execution. So if your starting point is

public CompletableFuture<Boolean> getFutureOfMyLongRunningTask() {
    CompletableFuture<Boolean> future = CompletableFuture.supplyAsync(() -> {
        // First stage processing here ...
        if (somecondition failed)
            return false; // Task failed!

        return true; // OK
    }).thenApplyAsync((Boolean result) -> {
        if (!result) // check of previous stage fail
            return false;

        // Second stage processing here ...

        if (!some condition satisfied) {
            // This is where I want the injection to happen. 
            // This stage should be suspended and a new stage should be
            // injected between this point and the next stage.
        }

        return true; // OK
    }).thenApplyAsync((Boolean result) -> {
        if (!result) // check of previous stage fail
            return false;

        // Third stage processing here ...

        return true; // OK
    });

    // This is the result we have to wait for.
    return future;
}

just change it to

public CompletableFuture<Boolean> getFutureOfMyLongRunningTask() {
    CompletableFuture<Boolean> future = CompletableFuture.supplyAsync(() -> {
        // First stage processing here ...
        if (somecondition failed)
            return false; // Task failed!
        // Second stage processing here ...

        if (!some condition satisfied) {
            // alternative "injected" stage processing
            if(injected stage failed)
                return false;
        }
        // Third stage processing here ...

        return true; // OK
    });

    // This is the result we have to wait for.
    return future;
}

which is shorter and clearer. You don't have to repeatedly check for a previous stage's success. You still have the same concurrency, but a potentially more efficient execution.
