Python PuLP Optimization - How to improve performance?

Question

I'm writing a script to help me schedule shifts for a team. I have the model and the script to solve it, but when I try to run it, it takes too long. I think it might be an issue with the size of the problem I'm trying to solve. Before getting into the details, here are some characteristics of the problem:

Team has 13 members.
Schedule shifts for a month: 18 operating hours/day, 7 days/week, for 4 weeks.
I want to schedule non-fixed shifts: members can be on shift during any of the 18 operating hours. They can even be on for a couple hours, off for another hour, and then on again, etc. This is subject to some constraints.

Sets of the problem:

m: each of the members
h: operating hours (0 to 17)
d: days of the week (0 to 6)
w: week of the month (0 to 4)

Variables:

X[h,h,d,w]: Binary. 1 if member m starts a shift on hour h,d,w. 0 otherwise.
Y[h,h,d,w]: Binary. 1 if member m is on shift on hour h,d,w. 0 otherwise.
W[m,d,w]: Binary. 1 if member m had any shift on day d,w. 0 otherwise.
G[m,w]: Binary. 1 if member m had the weekend off during week w. 0 otherwise.

The problem has 20 constraints, 13 which are "real constraints" of the problem and 7 which are relations between the variables, for the model to work.

When I run it, I get this message:

At line 2 NAME          MODEL
At line 3 ROWS
At line 54964 COLUMNS
At line 295123 RHS
At line 350083 BOUNDS
At line 369067 ENDATA
Problem MODEL has 54959 rows, 18983 columns and 201097 elements
Coin0008I MODEL read with 0 errors

I left it running overnight and didn't even get a solution. Then I tried changing all variables to continuous variables and it took ~25 seconds to find the optimal solution.

I don't know if I'm having issues because of the size of the problem, or if it's because I'm using only binary variables. Or a combination of that. Are there any general tips or best practices that could be used to improve performance on a model? Or is it always related to the specific model I'm trying to solve?

Answer 1

The long duration solve is almost certainly due to the number of binary variables in your model, and it may be degenerate, meaning that there are many solutions that produce the same value for the objective, so it is just thrashing around trying all of the (relatively equal) combinations.

The fact that it does solve when you relax the integer constraint is good and good troubleshooting. Assuming the model is constructed properly, here are a couple things to try:

Solve for 1 week at a time (4 separate solves). It isn't clear what the linkage is from week-to-week from your description, but that would reduce the model to 1/4 of its size.
Change your time-blocks to more than 1 hour. If you used 3 hour blocks, your problem would again reduce to 1/3 of its size by your description. You would only need to reduce the set H to {1,2,3,4,5,6} and then do the mental math to align that with the actual hours. Or you could do 2 hour blocks.
You should also tinker with the optimality gap. Sometimes the difference between 95% Optimal and 100% is days/weeks/years of solver time. Have you tried a .02, .05, 0.10, or 0.25 relative gap? You may still get the optimal answer, but you forgo the guarantee of optimality.

Python PuLP Optimization - How to improve performance?

Question

1 answers

solution1
0 2022-07-20 17:16:06

Python PuLP Optimization - How to improve performance?

Question

1 answers

solution1 0 2022-07-20 17:16:06

solution1
0 2022-07-20 17:16:06