
Basic Hazelcast distributed computing concepts

I've read through the Hazelcast documentation ( http://hazelcast.org/docs/latest/manual/html ), up through the Distributed Computing section, but I'm not clear on certain basic ideas. I wish to use IExecutorService to send several Runnable or Callable instances both to multiple threads on the local machine and to other compute nodes in a cluster. I have several questions:

  1. Do I need to create new Hazelcast instances inside the Runnables/Callables?
  2. What is the effect of creating instances inside Runnables/Callables versus creating them in the master thread?
  3. Will the IExecutorService do all the choosing about what nodes and threads to send off to?
  4. Do I have to worry about how the new Hazelcast instances I create will bind to threads and nodes, or does this happen automatically?

Thanks!

I talked with the folks from Hazelcast and found that I had some fundamental misunderstandings about how it works. I didn't understand that you have to deploy Hazelcast like a service or daemon by running "com.hazelcast.examples.StartServer" on the compute nodes. This is how the nodes become aware of and interact with each other. The Hazelcast zip includes batch and shell scripts for this purpose. Perhaps this is obvious to others, but I did not find anything in the documentation that spells it out explicitly. All I got from the docs was that dropping the jar in my classpath gives me access to all the classes and methods; I did not know how I was supposed to prepare the compute nodes to be aware of each other.

My first two questions above came from the Hazelcast documentation, where in the first example of the Distributed Computing section, they create a new Hazelcast instance inside the Callable. I'm not sure why they do this, but it was extremely misleading to me. I thought it meant that I needed to create and associate a new Hazelcast instance with each thread.

Pveentjer's answer covers question 3. The answer is basically: yes, if you want it to.

Question 4 was just due to my confusion about how Hazelcast works. There is no one-to-one mapping between Hazelcast instances and threads, as I had thought. Each Hazelcast instance is already multithreaded, so there is no need to create more than one instance on a node for parallel processing (though you may want to do so for other reasons, such as heap space limitations). Of course, you definitely have to deploy Hazelcast on all compute nodes (for which I use the StartServer class mentioned above).
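As a rough analogy for the point about threads, here is a sketch that uses only the JDK's own ExecutorService (which IExecutorService extends), not Hazelcast itself. The class and method names are made up for illustration. It shows a single executor running several tasks on several worker threads at once, which is the same reason one Hazelcast instance per node is enough for parallel work.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical demo class: plain JDK executor, used only as an analogy.
class SingleExecutorDemo {

    // Submits `tasks` jobs to ONE executor and returns the names of the
    // worker threads that ran them. The latch makes every job wait until
    // all of them have started, so they must overlap in time.
    static Set<String> workerThreadNames(int tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(tasks);
        CountDownLatch allStarted = new CountDownLatch(tasks);
        try {
            List<Future<String>> results = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                results.add(pool.submit(() -> {
                    allStarted.countDown();
                    allStarted.await(); // block until every job is running
                    return Thread.currentThread().getName();
                }));
            }
            Set<String> names = new TreeSet<>();
            for (Future<String> result : results) {
                names.add(result.get());
            }
            return names;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("Task did not complete", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // One executor, four tasks, four distinct worker threads.
        System.out.println(workerThreadNames(4));
    }
}
```

The pool here is sized to the number of tasks only so that the latch cannot deadlock; it is not a statement about how Hazelcast sizes its own executor pools.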

In a nutshell, I was able to create a compute cluster rather easily by simply

1) dropping the Hazelcast jar into my classpath on the master node (or including it in my Eclipse project)

2) deploying Hazelcast on compute nodes using the batch file with the StartServer call

3) creating a Runnable and making it Serializable (along with all its dependencies)

4) creating a Hazelcast instance in my main() method and obtaining an IExecutorService to execute my Runnable instances

The only other important step is to make sure that when you deploy StartServer on the compute nodes, you put both the Hazelcast jar, and all jars containing the definitions of your Runnable and all the classes on which it depends, in your classpath.

Below is a simple example:

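import java.io.Serializable;

// The task must be Serializable so that it can be sent to other nodes.
public class myRunnable implements Runnable, Serializable {

    // Every field the task holds must be Serializable as well.
    myTestClass mclass;

    public myRunnable() {
        mclass = new myTestClass();
    }

    @Override
    public void run() {
        try {
            System.out.println("Putting thread to sleep for 5 seconds");
            Thread.sleep(5000);
        }
        catch (Exception e) {
            e.printStackTrace();
        }

        // The thread name shows which worker thread picked up the task.
        System.out.println("\nTesting MyRunnable on Thread: " + Thread.currentThread().getName());
    }

}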

Then define the class on which the Runnable depends:

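import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

public class myTestClass implements Serializable {

    List<Double> list = new ArrayList<Double>(10);

    // A constructor has no return type. Declared with "void" it would be an
    // ordinary method that is never called, and the list would stay empty.
    public myTestClass() {
        for (int i = 0; i < 10; i++)
            list.add((double) i);
    }

}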

Finally, create a Hazelcast instance and an IExecutorService to execute the tasks:

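import com.hazelcast.config.Config;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.core.IExecutorService;

public class TestHazelCast {

    public static void main(String[] args) {

        // Start a Hazelcast instance on the master node with the default configuration.
        Config cfg = new Config();
        HazelcastInstance instance = Hazelcast.newHazelcastInstance(cfg);
        IExecutorService exec = instance.getExecutorService("exec");

        // Submit seven tasks and let Hazelcast decide where each one runs.
        for (int i = 0; i < 7; i++) {
            exec.execute(new myRunnable());
        }

    }

}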

Then deploy Hazelcast on a compute node along with the appropriate jars using something like the following (the ";" classpath separator is for Windows; use ":" on Linux or macOS):

java -server -Xms1G -Xmx1G -cp "../lib/hazelcast-3.2.2.jar;../lib/AllMyClasses.jar" com.hazelcast.examples.StartServer
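Step 3 above requires the Runnable and everything it references to be Serializable. A quick local check before deploying can catch a missing "implements Serializable" early. The sketch below assumes the task relies on standard java.io serialization; SampleTask, SamplePayload and SerializationCheck are hypothetical names, not part of Hazelcast. It only round-trips the object inside one JVM, so it does not verify that the compute nodes have the right jars on their classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical dependency of the task; it must be Serializable too.
class SamplePayload implements Serializable {
    private static final long serialVersionUID = 1L;
    final List<Double> values = new ArrayList<>();

    SamplePayload() {
        for (int i = 0; i < 10; i++) {
            values.add((double) i);
        }
    }
}

// Hypothetical task, shaped like the Runnable in the example above.
class SampleTask implements Runnable, Serializable {
    private static final long serialVersionUID = 1L;
    final SamplePayload payload = new SamplePayload();

    @Override
    public void run() {
        System.out.println("Payload size after round trip: " + payload.values.size());
    }
}

class SerializationCheck {

    // Writes the object to a byte array and reads it back, as standard Java
    // serialization would when an object is shipped to another JVM.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            // A NotSerializableException (an IOException) ends up here.
            throw new IllegalStateException("Object is not fully serializable", e);
        }
    }

    public static void main(String[] args) {
        SampleTask copy = roundTrip(new SampleTask());
        copy.run(); // prints "Payload size after round trip: 10"
    }
}
```

If any field in the object graph is not Serializable, roundTrip fails with an IllegalStateException whose cause names the offending class.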

-- Do I need to create new Hazelcast instances inside the Runnables/Callables?

Why do you want to do that? If you need to access the HazelcastInstance that is running the runnable/callable, let it implement HazelcastInstanceAware and you get the HazelcastInstance injected.

-- What is the effect of creating instances inside Runnables/Callables versus creating them in the master thread?

Don't understand your question. Please elaborate.

-- Will the IExecutorService do all the choosing about what nodes and threads to send off to?

Depends on the call you make. There are different execute methods: execute on a specific member (executeOnMember), on a subset or all of the members (executeOnMembers, executeOnAllMembers), on the member owning a key's partition (executeOnKeyOwner), or on any member (execute).

So you can leave it fully up to HZ or you can take full control. Whatever you need.

-- Do I have to worry about how the new Hazelcast instances I create will bind to threads and nodes, or does this happen automatically?

I don't know what you mean.
