
Combinatorics algorithm parallelization

I'm writing a program that enumerates all C(n, k) combinations where n is much larger than k (e.g. n=39, k=13 -> 8122425444 combinations). I also need to run some calculations on every combination in real time. How can I divide my algorithm across several threads to make it faster?

public void getCombinations(List<Item> items) {
    int n = items.size();
    int k = 13;
    int[] res = new int[k];
    for (int i = 1; i <= k; i++) {
        res[i - 1] = i;
    }
    int p = k;
    while (p >= 1) {
        //here I make a Set from items in List by ids in res[]
        Set<Item> cards = convert(res, items);
        //some calculations
        if (res[k - 1] == n) {
            p--;
        } else {
            p = k;
        }
        if (p >= 1) {
            for (int i = k; i >= p; i--) {
                res[i - 1] = res[p - 1] + i - p + 1;
            }
        }
    }
}

private Set<Item> convert(int[] res, List<Item> items) {
    Set<Item> set = new TreeSet<Item>();
    for (int i : res) {
        set.add(items.get(i - 1));
    }
    return set;
}

If you're using JDK 7 then you could use fork/join to divide and conquer this algorithm.
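
A rough sketch of what that could look like, not a definitive implementation. The class name CombinationTask, the splitDepth parameter and the countAll helper are my own inventions for illustration; the task only counts the combinations it visits, and the "some calculations" step from the question would go where the comment says.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

/** Hypothetical task: visits every k-subset of {0..n-1} that extends a fixed prefix. */
final class CombinationTask extends RecursiveTask<Long> {
    private final int n, k, depth, start, splitDepth;
    private final int[] chosen; // chosen[0..depth) is the prefix owned by this task

    CombinationTask(int n, int k, int[] chosen, int depth, int start, int splitDepth) {
        this.n = n;
        this.k = k;
        this.chosen = chosen;
        this.depth = depth;
        this.start = start;
        this.splitDepth = splitDepth;
    }

    @Override
    protected Long compute() {
        if (depth >= splitDepth || depth == k) {
            return visit(depth, start); // prefix is long enough: enumerate the rest sequentially
        }
        List<CombinationTask> children = new ArrayList<>();
        for (int v = start; v <= n - (k - depth); v++) {
            int[] next = chosen.clone(); // each child gets its own buffer
            next[depth] = v;
            children.add(new CombinationTask(n, k, next, depth + 1, v + 1, splitDepth));
        }
        long total = 0;
        for (CombinationTask child : invokeAll(children)) { // fork all children, wait for them
            total += child.join();
        }
        return total;
    }

    /** Sequential enumeration of everything below the current prefix. */
    private long visit(int d, int from) {
        if (d == k) {
            // "some calculations" on chosen[0..k) would go here
            return 1;
        }
        long count = 0;
        for (int v = from; v <= n - (k - d); v++) {
            chosen[d] = v;
            count += visit(d + 1, v + 1);
        }
        return count;
    }

    /** Counts all k-subsets of {0..n-1}, forking one subtask per prefix of length splitDepth. */
    static long countAll(int n, int k, int splitDepth) {
        ForkJoinPool pool = new ForkJoinPool();
        try {
            return pool.invoke(new CombinationTask(n, k, new int[k], 0, 0, splitDepth));
        } finally {
            pool.shutdown();
        }
    }
}
```

With splitDepth = 2 or 3 there are far more subtasks than cores, which lets the pool's work stealing even out the fact that the subtasks are of different sizes.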

If you want to keep things simple, I would just start a few threads, have each one process its own subset of the combinations, and use a CountDownLatch to wait until all threads have completed. The right number of threads depends on how many cores your CPU has.
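
A minimal sketch of the latch approach, assuming Java 8 or later for the lambda and assuming 1 <= k <= n. The names LatchedCombinations, countAll and visitWithFirst are made up for this example, and the way the work is divided (thread t takes the first elements t, t + threads, t + 2*threads, ...) is just one possible choice.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicLong;

final class LatchedCombinations {

    /** Counts all k-subsets of {0..n-1}; thread t handles first elements t, t+threads, ... */
    static long countAll(int n, int k, int threads) {
        CountDownLatch done = new CountDownLatch(threads);
        AtomicLong total = new AtomicLong();
        for (int t = 0; t < threads; t++) {
            final int offset = t;
            new Thread(() -> {
                try {
                    long local = 0;
                    for (int first = offset; first <= n - k; first += threads) {
                        local += visitWithFirst(n, k, first);
                    }
                    total.addAndGet(local);
                } finally {
                    done.countDown(); // always signal, even if the work failed
                }
            }).start();
        }
        try {
            done.await(); // block until every worker has counted down
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return total.get();
    }

    /** Visits every combination whose smallest element is `first`; returns how many there were. */
    static long visitWithFirst(int n, int k, int first) {
        int[] c = new int[k];
        for (int i = 0; i < k; i++) {
            c[i] = first + i; // smallest combination that starts with `first`
        }
        long count = 0;
        while (true) {
            count++; // "some calculations" on c would go here
            int i = k - 1;
            while (i >= 1 && c[i] == n - k + i) {
                i--; // skip positions that are already at their maximum
            }
            if (i < 1) {
                return count; // only c[0] could still move, and it is fixed for this call
            }
            c[i]++;
            for (int j = i + 1; j < k; j++) {
                c[j] = c[j - 1] + 1;
            }
        }
    }
}
```

Keep in mind that the slices are not equal in size: combinations with a small first element are far more numerous, so the thread that gets first = 0 will finish last.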

You could also use Hadoop's map/reduce if you think the input will grow, so that you can spread the computation over several computers. You will need to express your calculation as a map/reduce operation, but there are plenty of examples to look at.

I have been working on some code that works with combinatoric sets of this size. Here are a few suggestions for getting output in a reasonable amount of time.

  • Instead of building a list of combinations and then processing them, write your program so that it can produce a combination from its rank (its index in the range 0 to C(n, k) - 1). A signed 64-bit long can hold the rank of every combination for all k values up to n = 66. This lets you easily split the range of ranks and assign the pieces to different threads/hardware.
  • If your computation is simple, you should look at using OpenCL or CUDA to do the work. There are a couple of options for doing this. Rootbeer and Aparapi are options for staying in Java and letting a library take care of the GPU details. JavaCL is a nice binding to OpenCL, if you do not mind writing kernels directly in C99. AWS has GPU instances for doing this type of work.
  • If you are going to collect a result for each combination, you really need to consider storage space. For your example of C(39,13), you would need a little under 61 GiB (about 65 GB) just to store a long for each combination. You need a good strategy for dealing with datasets of this size.
    • If you are trying to roll up this data into a simple result for the entire set of combinations, then follow @algolicious' suggestion and look at map/reduce to solve this problem.
    • If you really need answers for each combination, but a little error is OK, you may want to look at using AI algorithms or a linear solver to compress the data. Be aware that these techniques will only work if there is something to learn in the resulting data.
    • If no error is acceptable and you need every answer, you may want to consider simply recomputing each answer whenever you need it, based on the combination's rank.
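
To make the rank idea from the first bullet concrete, here is a sketch under my own assumptions: combinations are ordered lexicographically over the elements 0..n-1, and the names CombinationRank, binomial, unrank, next and sliceStart are invented for this example. A worker would call unrank once for the first rank of its slice and then step through the rest of the slice with next.

```java
final class CombinationRank {

    /** C(n, k) as a long; throws ArithmeticException if an intermediate product overflows. */
    static long binomial(int n, int k) {
        if (k < 0 || k > n) {
            return 0;
        }
        k = Math.min(k, n - k);
        long result = 1;
        for (int i = 1; i <= k; i++) {
            // result is C(n-k+i-1, i-1) here, so the division below is exact
            result = Math.multiplyExact(result, n - k + i) / i;
        }
        return result;
    }

    /** Returns the combination with the given lexicographic rank (0-based). */
    static int[] unrank(long rank, int n, int k) {
        if (rank < 0 || rank >= binomial(n, k)) {
            throw new IllegalArgumentException("rank out of range: " + rank);
        }
        int[] c = new int[k];
        int v = 0; // next candidate element
        for (int i = 0; i < k; i++) {
            // number of combinations that have v at position i (given the earlier positions)
            long block = binomial(n - 1 - v, k - 1 - i);
            while (rank >= block) {
                rank -= block; // skip all of them and try the next element
                v++;
                block = binomial(n - 1 - v, k - 1 - i);
            }
            c[i] = v++;
        }
        return c;
    }

    /** Advances c to its lexicographic successor; returns false if c was the last one. */
    static boolean next(int[] c, int n) {
        int k = c.length;
        int i = k - 1;
        while (i >= 0 && c[i] == n - k + i) {
            i--;
        }
        if (i < 0) {
            return false;
        }
        c[i]++;
        for (int j = i + 1; j < k; j++) {
            c[j] = c[j - 1] + 1;
        }
        return true;
    }

    /** First rank of slice i when [0, total) is cut into `parts` near-equal slices. */
    static long sliceStart(long total, int parts, int i) {
        return (total / parts) * i + Math.min(i, total % parts);
    }
}
```

Slice i then covers the ranks from sliceStart(total, parts, i) up to, but not including, sliceStart(total, parts, i + 1), so every worker gets almost exactly the same number of combinations regardless of how they are distributed over first elements.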

The simplest way to split combinations is to have combinations of combinations. ;)

For each possible "first" value you can create a new task in a thread pool. Or you can create a task for each possible pair of "first" and "second" values, or each triple, and so on. You only need about as many tasks as you have CPUs, so you don't need to go overboard.

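E.g. say you want to create all possible selections of 13 from 39 items.

for (int i = 0; i <= items.size() - 13; i++) {
   Item first = items.get(i);
   // only the items after "first", so that no selection is produced twice
   List<Item> rest = items.subList(i + 1, items.size());
   // create a task which considers all selections of 12 from rest (plus first)
   createCombinationsOf(first, rest, 12);
}

This creates 27 tasks of decreasing size (the first one alone covers C(38, 12), a third of all selections), so let a thread pool hand them out as threads become free. If you want a more even split, create tasks for pairs of "first" and "second" items instead.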
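
Here is one way the tasks could be wired into a thread pool. Treat it as a sketch: createCombinationsOf is not defined anywhere above, so the body below is my guess at it (it only counts the selections it visits), and FirstItemSplit, extend and countSelections are names I made up. Each task draws only from the items after its first item, so every selection is visited exactly once. It assumes Java 8 or later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class FirstItemSplit {

    /** Visits every selection made of `first` plus `remaining` items from `rest`. */
    static <T> long createCombinationsOf(T first, List<T> rest, int remaining) {
        List<T> selection = new ArrayList<>();
        selection.add(first);
        return extend(selection, rest, 0, remaining);
    }

    private static <T> long extend(List<T> selection, List<T> rest, int from, int remaining) {
        if (remaining == 0) {
            // "some calculations" on selection would go here
            return 1;
        }
        long count = 0;
        for (int i = from; i <= rest.size() - remaining; i++) {
            selection.add(rest.get(i));
            count += extend(selection, rest, i + 1, remaining - 1);
            selection.remove(selection.size() - 1); // backtrack
        }
        return count;
    }

    /** Submits one task per possible first item and adds up the per-task counts. */
    static <T> long countSelections(List<T> items, int k, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> parts = new ArrayList<>();
            for (int i = 0; i <= items.size() - k; i++) {
                T first = items.get(i);
                List<T> rest = items.subList(i + 1, items.size()); // read-only view
                Callable<Long> task = () -> createCombinationsOf(first, rest, k - 1);
                parts.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> part : parts) {
                total += part.get(); // waits for that task to finish
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for tasks", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

The tasks differ a lot in size, so submitting more tasks than threads (for example one per pair of leading items) gives the pool more room to keep every core busy.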

Your question is quite vague.

What problem are you having right now? Implementing the divide-and-conquer part of the algorithm (threading, joining, etc.), or figuring out how to divide the problem into its sub-parts?

The latter should be your first step. Do you know how to break your original problem into several smaller problems (that can then be dispatched to Executor threads or a similar mechanism to be processed), and how to join the results?
