
Cassandra connections - Shared 'Cluster' instance or multiple?

When using the Cassandra driver in a Java project, what is the best practice for managing connections? Specifically, is it better to let multiple threads share a single Cluster instance, or to allocate a separate Cluster instance for each thread that needs to talk to Cassandra?

I followed the example code and am setting up my Cluster instance like:

Cluster.builder().addContactPoint(HOST).withPort(PORT)
    .withCredentials(USER, PASS).build();

So what I'm asking is: would the preferred approach be something like this (a single shared Cluster instance):

private static Cluster _cluster = null;

public static Cluster connect() {
    if (_cluster != null && ! _cluster.isClosed()) {
        //return the cached instance
        return _cluster;
    }

    //create a new instance
    _cluster = Cluster.builder().addContactPoint(HOST).withPort(PORT)
                .withCredentials(USER, PASS).build();
    return _cluster;
}

...or is it best practice to return multiple Cluster instances? Like so:

public static Cluster connect() {
    //every caller gets their own Cluster instance
    return Cluster.builder().addContactPoint(HOST).withPort(PORT)
                .withCredentials(USER, PASS).build();
}

I guess the points at the core of this question are:

  • Is building a new Cluster instance an expensive operation?
  • Will the Cluster object internally manage/pool connections to the backing datastore, or does it function more like an abstraction of a single connection?
  • Is the Cluster object thread-safe?

Is building a new Cluster instance an expensive operation?

Calling build() to construct a Cluster instance performs no network I/O, so it is inexpensive.

Will the Cluster object internally manage/pool connections to the backing datastore, or does it function more like an abstraction of a single connection?

What is expensive is calling cluster.init(), which opens a single connection (the control connection) to one of your contact points. cluster.connect() is more expensive still: it initializes the Cluster if that has not already happened, and it creates a Session that maintains a connection pool to each discovered host (pool size is governed by your PoolingOptions). So yes: a Cluster holds a control connection to track the state of hosts, and each Session created via Cluster.connect() holds a connection pool to each host.

Is the Cluster object thread-safe?

Put simply, yes :)

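The DataStax post "4 simple rules when using the DataStax drivers for Cassandra" provides further guidance on this topic; its first rule is to use one Cluster instance per physical cluster for the lifetime of the application.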
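
Because a Cluster is thread-safe and costly to initialize, sharing one instance across threads is the natural choice. The cached-instance method in the question is not itself synchronized, so two threads could each build a Cluster at the same moment. One possible way around that is the initialization-on-demand holder idiom, sketched below. To keep the example runnable without the driver on the classpath, StubCluster is a hypothetical stand-in for the driver's Cluster that only counts how many times it is constructed. SharedCluster and SharedClusterDemo are likewise illustrative names, not part of any library.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class SharedClusterDemo {

    // Hypothetical stand-in for the driver's Cluster (an assumption made
    // so the sketch runs on its own); it only counts constructions.
    static final class StubCluster {
        static final AtomicInteger BUILD_COUNT = new AtomicInteger();

        StubCluster() {
            BUILD_COUNT.incrementAndGet();
        }
    }

    // Hands out one shared instance for the lifetime of the application.
    static final class SharedCluster {
        private SharedCluster() {}

        // The JVM initializes Holder once, on first access, and makes the
        // result safely visible to every thread without explicit locking.
        private static final class Holder {
            static final StubCluster INSTANCE = new StubCluster();
        }

        static StubCluster get() {
            return Holder.INSTANCE;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        int threads = 8;
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            // Every worker asks for the shared instance concurrently.
            pool.execute(() -> SharedCluster.get());
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);

        // prints "instances built: 1"
        System.out.println("instances built: " + StubCluster.BUILD_COUNT.get());
    }
}
```

With the real driver, the holder field would presumably be built the same way as in the question (Cluster.builder()...build()), and the application would call close() on it once at shutdown.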
