
Possible race condition when enabling multithreading

Suppose I have a slight variant of the cloud balancing problem, in which the Process has not just one weight, but a map of (positive) weights, such as

Map<Long, Long> groupMap = new HashMap<>();

where the key is specific to my domain and the value is the weight.
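
For context, here is a minimal sketch of the modified Process entity (the field and getter names, the @PlanningId, and the "computerRange" value range reference are illustrative, mirroring the cloud balancing example):

import java.util.HashMap;
import java.util.Map;

import org.optaplanner.core.api.domain.entity.PlanningEntity;
import org.optaplanner.core.api.domain.lookup.PlanningId;
import org.optaplanner.core.api.domain.variable.PlanningVariable;

@PlanningEntity
public class Process {

    @PlanningId
    private Long id;

    // domain-specific key -> (positive) weight
    private Map<Long, Long> groupMap = new HashMap<>();

    @PlanningVariable(valueRangeProviderRefs = "computerRange")
    private Computer computer;

    public Map<Long, Long> getGroupMap() {
        return groupMap;
    }

    public Computer getComputer() {
        return computer;
    }

    public void setComputer(Computer computer) {
        this.computer = computer;
    }
}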

On the class Computer (still referring to the cloud balancing example) I have a shadow variable hist, which is also a (Hash)Map<Long, Long>, and a custom listener updating hist:

import java.util.Map;

import org.optaplanner.core.api.domain.variable.VariableListener;
import org.optaplanner.core.api.score.director.ScoreDirector;

public class HistListener implements VariableListener {
    @Override
    public void beforeVariableChanged(ScoreDirector scoreDirector, Object o) {
        Process p = (Process) o;
        if (p.getComputer() != null) {

            Computer kc = p.getComputer();

            // update the hist map: subtract this process's group weights from the computer's histogram
            scoreDirector.beforeVariableChanged(kc, "hist");
            for (Map.Entry<Long, Long> entry : p.getGroupMap().entrySet()) {
                kc.getHist().put(entry.getKey(), kc.getHist().get(entry.getKey()) - entry.getValue());
            }
            scoreDirector.afterVariableChanged(kc, "hist");
        }
    }
    // other VariableListener callbacks (afterVariableChanged, entity added/removed) omitted here
}

afterVariableChanged is pretty much the same, just with the sign reversed.

I annotate both Process and Computer as @PlanningEntity and register them in the solverConfig.
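
For reference, here is roughly how the Computer side looks (the @ShadowVariable wiring and the getter/setter names are illustrative; the actual field is the hist map mentioned above):

import java.util.HashMap;
import java.util.Map;

import org.optaplanner.core.api.domain.entity.PlanningEntity;
import org.optaplanner.core.api.domain.lookup.PlanningId;
import org.optaplanner.core.api.domain.variable.ShadowVariable;

@PlanningEntity
public class Computer {

    @PlanningId
    private Long id;

    // key -> accumulated weight of the processes currently assigned to this computer
    @ShadowVariable(variableListenerClass = HistListener.class,
            sourceEntityClass = Process.class, sourceVariableName = "computer")
    private Map<Long, Long> hist = new HashMap<>();

    public Map<Long, Long> getHist() {
        return hist;
    }

    public void setHist(Map<Long, Long> hist) {
        this.hist = hist;
    }
}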

There are no constraints, so the solver should be able to assign the computers to the processes arbitrarily. As a result, I expect hist to contain only natural numbers (including 0) as values.

When running it with <moveThreadCount>NONE</moveThreadCount>, this is indeed the case:

<"Computer"+computer.id: hist>

Computer0: {0=0, 1=0, 2=20, 3=0, 4=10, 5=20, 6=0, 7=10, 8=10, 9=20}
Computer1: {0=0, 1=10, 2=0, 3=0, 4=10, 5=0, 6=10, 7=0, 8=0, 9=0}
Computer2: {0=0, 1=0, 2=0, 3=0, 4=0, 5=0, 6=0, 7=0, 8=0, 9=0}

When running exactly the same code with <moveThreadCount>AUTO</moveThreadCount>, I get some negative values in hist:

Computer0: {0=0, 1=-20, 2=30, 3=0, 4=-40, 5=50, 6=-10, 7=30, 8=40, 9=150}
Computer1: {0=0, 1=-40, 2=-20, 3=0, 4=-90, 5=-50, 6=-40, 7=-20, 8=-20, 9=-30}
Computer2: {0=0, 1=80, 2=-20, 3=0, 4=30, 5=-30, 6=50, 7=0, 8=-20, 9=-50} 

This discrepancy disappears when I refactor the keys of groupMap on Process and those of hist on Computer into individual shadow variables.
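
To clarify, by "individual shadow variables" I mean roughly the following on Computer (sketch; imports as in the Computer sketch above, the field names are illustrative, and the listener then brackets each field with its own before/afterVariableChanged calls instead of a single "hist" notification):

@PlanningEntity
public class Computer {

    // one plain shadow field per group key instead of a single Map<Long, Long> hist
    @ShadowVariable(variableListenerClass = HistListener.class,
            sourceEntityClass = Process.class, sourceVariableName = "computer")
    private Long histGroup0;

    @ShadowVariable(variableListenerClass = HistListener.class,
            sourceEntityClass = Process.class, sourceVariableName = "computer")
    private Long histGroup1;

    // ... one field per key, with matching getters and setters ...
}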

The trace logs suggest a race condition where several threads access hist simultaneously. (According to the Oracle docs, I only need a synchronizedMap implementation if the map is structurally changed, i.e. if keys are added or removed; I'm not doing that.)

The use of a Map as a shadow variable greatly enhances the flexibility of my solution, so it would be great if this were supported with multithreading. I know I could probably fix this very simple example with an appropriate ConstraintProvider, but my actual problem is much more complex and is not amenable to treatment with ConstraintProviders.

Question: Is it possible to have a Map based structure as a shadow variable in a multithreading context?

If it is not possible, I recommend adding a short note to the docs of OptaPlanner 8.29.0.Final (the version I'm using).

I had a look at questions regarding Lists as PlanningVariables in OptaPlanner, but I don't see how those questions relate to mine.

Is it possible to have a Map based structure as a shadow variable in a multithreading context?

Yes, because each move thread in a multithreaded context internally has its own ScoreDirector and its own workingSolution. From the point of view of a shadow variable and that map, it's single-threaded.

What can mess this up?

  • Bad @PlanningIds in your dataset, so the Move.rebase() operations go wrong: duplicate IDs or missing IDs. OptaPlanner detects most of these, so it's unlikely that this is your problem.
  • Incomplete planning cloning in your model. That's probably it. This will also cause issues you haven't seen yet in a single-threaded context, especially when the last working solution differs greatly from the last best solution found when the termination runs out. FULL_ASSERT should detect those, but they might not occur on every run...

"Each move thread has its own workingSolution internally" is not entirely true: they all have a planning clone of the original. But if the planning clone doesn't clone all of the data that the shadow variables affect, it's corrupted. In a multithreaded solving context this causes issues much faster.
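
To make that concrete, here is a contrived plain-Java illustration (not OptaPlanner API) of what happens when the planning clone does not clone the map itself:

import java.util.HashMap;
import java.util.Map;

public class SharedMapIllustration {
    public static void main(String[] args) {
        // The "original" computer's histogram.
        Map<Long, Long> hist = new HashMap<>();
        hist.put(1L, 0L);

        // An incomplete planning clone copies the reference, not the map.
        Map<Long, Long> histOfClone = hist;

        // Two move threads, each believing it owns its clone, now update one shared map:
        histOfClone.put(1L, histOfClone.get(1L) - 10L); // thread A undoes a move
        hist.put(1L, hist.get(1L) + 20L);               // thread B applies a move

        // Unsynchronized, interleaved updates like these produce corrupted
        // (for example negative) values, as seen with moveThreadCount AUTO.
        System.out.println(hist); // deterministic here, nondeterministic with real threads
    }
}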

Ok, this is getting complex. How do I solve this?

Experiment with adding a @DeepPlanningClone annotation on your Map field. Although, IIRC, making a field a shadow variable already implies deep planning cloning it automatically. My guess is that the keys or values in that map need to be planning cloned too. Read the planning clone section in the docs.
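
A minimal sketch of that experiment, assuming the hist map lives on Computer as in the question (whether it actually helps depends on what the keys and values reference):

import java.util.HashMap;
import java.util.Map;

import org.optaplanner.core.api.domain.entity.PlanningEntity;
import org.optaplanner.core.api.domain.solution.cloner.DeepPlanningClone;

@PlanningEntity
public class Computer {

    // Ask the generic planning cloner to create a new map per planning clone
    // instead of letting clones share the same HashMap instance.
    @DeepPlanningClone
    private Map<Long, Long> hist = new HashMap<>();

    // ... the shadow variable wiring and getters/setters stay as before ...
}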
