How to use external data in the objective function in Python PuLP?

Question

I want to select a square fixed-size subset of a square matrix, such that the sum of the subset matrix is minimised. Some code:

import numpy as np
import pulp

def subset_matrix(data, inds):
    return data[np.ix_(inds, inds)]

A = np.random.random((10, 10))

indices = list(range(len(A)))

prob = pulp.LpProblem("Minimum subset", pulp.LpMaximize)

x = pulp.LpVariable.dicts('elem', indices, lowBound=0, upBound=1, cat=pulp.LpInteger)

prob += pulp.lpSum(subset_matrix(A, [x[i] for i in indices]))

prob.solve()

This fails because NumPy indexing does not accept a list of LpVariable objects as indices. Is there a way around this? How can I make PuLP's constraints contain a NumPy array look-up?

Answer 1

I don't think this is a PuLP question so much as a question of how to properly formulate a math problem as a mixed-integer linear program.

It looks like you're trying to express your objective as a sum of coefficients ("the sum of the subset matrix"), over a set of indices to be optimized. (By the way, I don't see where the size constraint on the submatrix is written.) But an MILP requires that the objective be the dot product of a vector of decision variables with a vector of cost coefficients, over a predetermined index set. So in the natural formulation, the decision vector will represent which indices from the full index set you select to be in your submatrix, using binary values.

If I understand what you're trying to do, this seems like a neat problem. I believe you're trying to select a fixed-size subset of indices I \subset {0, 1, ..., N-1}, such that sum{A(i,j): i,j both in I} is maximized. Suppose for example the big matrix is 10x10, and you want a 6x6 submatrix. So I is some six elements of {0, ..., 9}.
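To make the index-set view concrete, here is a small sketch in plain Python. The 4x4 matrix `A` and the helper names `submatrix_sum` and `masked_sum` are made up for illustration and do not come from the question. It shows that summing over a chosen index set gives the same number as weighting every A[i][j] by the 0/1 indicators y[i]*y[j], which is the form a solver can work with once the product is linearised.

```python
# Hypothetical 4x4 data matrix, chosen only for illustration.
A = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]

def submatrix_sum(data, index_set):
    """Sum of data[i][j] over all i, j in index_set (direct look-up)."""
    return sum(data[i][j] for i in index_set for j in index_set)

def masked_sum(data, y):
    """The same sum, written with 0/1 indicators y[i] instead of an index set."""
    n = len(data)
    return sum(data[i][j] * y[i] * y[j] for i in range(n) for j in range(n))

I = [0, 2]                                        # selected indices
y = [1 if i in I else 0 for i in range(len(A))]   # indicator vector for I

print(submatrix_sum(A, I))  # 1 + 3 + 9 + 11 = 24
print(masked_sum(A, y))     # same value, 24
```

The data only ever appears as coefficients here. The decision is carried entirely by the 0/1 values, so nothing has to index into the array with a variable.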

Then I would define variables x(i,j) for i, j both in {0, ..., 9}, equal to one for each element of the big matrix that is selected for the submatrix (zero otherwise), and variables y(i), i in {0, ..., 9}, for the indices that are selected. I would then link x and y with linear constraints, so that x(i,j) is one exactly when y(i) and y(j) are both one, and make the y variables binary to express that each index is in or out.
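As a quick sanity check of that linking idea, the sketch below enumerates all four 0/1 combinations of y(i) and y(j) and lists which values of x satisfy three linear inequalities: x >= y(i) + y(j) - 1, x <= y(i), x <= y(j). This is the standard way to linearise a product of two binaries. The function name `feasible_x_values` is hypothetical.

```python
from itertools import product

def feasible_x_values(yi, yj):
    """Return the 0/1 values of x that satisfy the three linking constraints."""
    return [
        x for x in (0, 1)
        if x >= yi + yj - 1   # forces x = 1 when both indices are selected
        and x <= yi           # forces x = 0 when row index i is not selected
        and x <= yj           # forces x = 0 when column index j is not selected
    ]

# Print the truth table: in every case exactly one value of x is feasible.
for yi, yj in product((0, 1), repeat=2):
    print(yi, yj, feasible_x_values(yi, yj))
```

In each case the only feasible value is y(i)*y(j). The same inequalities, together with bounds of 0 and 1, also pin a continuous x to that value whenever the y's are binary, so x does not strictly need to be declared as an integer variable.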

Here is a formulation of what I think you mean:

import pulp as pp
import numpy as np
import itertools

#####################
#  Problem Data:    #
#####################

full_matrix_size = 10
submatrix_size = 6

A = np.random.random((full_matrix_size, full_matrix_size)).round(2)

inds = range(full_matrix_size)
product_inds = list(itertools.product(inds,inds))

#####################
#  Variables:       #
#####################

# x[(i,j)] = 1 if the (i,j)th element of the data matrix is in the submatrix, 0 otherwise.
x = pp.LpVariable.dicts('x', product_inds, cat='Continuous', lowBound=0, upBound=1)

# y[i] = 1 if i is in the selected index set, 0 otherwise.
y = pp.LpVariable.dicts('y', inds, cat='Binary')

prob = pp.LpProblem("submatrix_problem", pp.LpMaximize)

#####################
#  Constraints:     #
#####################

# The following constraints express the required submatrix shape:
for (i,j) in product_inds:
    # x[(i,j)] must be 1 if i and j are both in the selected index set:
    # x[(i,j)] >= y[i] + y[j] - 1
    prob += pp.LpConstraint(e=x[(i,j)] - y[i] - y[j], sense=pp.LpConstraintGE, rhs=-1,
                            name="true_if_both_%s_%s" % (i,j))

    # x[(i,j)] must be 0 if i is not in the selected index set: x[(i,j)] <= y[i]
    prob += pp.LpConstraint(e=x[(i,j)] - y[i], sense=pp.LpConstraintLE, rhs=0,
                            name="false_if_not_row_%s_%s" % (i,j))

    # x[(i,j)] must be 0 if j is not in the selected index set: x[(i,j)] <= y[j]
    prob += pp.LpConstraint(e=x[(i,j)] - y[j], sense=pp.LpConstraintLE, rhs=0,
                            name="false_if_not_col_%s_%s" % (i,j))

# The number of selected indices must be what we require:
prob += pp.LpConstraint(e=pp.LpAffineExpression([(y[i],1) for i in inds]), sense=pp.LpConstraintEQ,
                        rhs=submatrix_size, name="submatrix_size")

#####################
#  Objective:       #
#####################

prob += pp.LpAffineExpression([(x[pair], A[pair]) for pair in product_inds])

print(prob)

###################################
#  Write out and solve the model: #
###################################

prob.writeLP("max_sum_submatrix.lp")
prob.solve()

########################## 
#  Display the solution: #
##########################
print("The following indices were selected:")
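# Compare against 0.5 rather than 1, since a solver may return values such as 0.9999999.
print([v.name for v in prob.variables() if v.name.startswith('y') and v.varValue > 0.5])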
print("Objective value is " + str(pp.value(prob.objective)))

I'm guessing that was a take-home exam problem.... At least the semester is over now.
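For a 10x10 matrix there are only 210 six-element index sets, so the solver's answer can be cross-checked by brute force. The sketch below is an independent check and not part of the formulation above. The function name `best_subset_brute_force` is hypothetical, and it generates its own random data with the standard library, so to compare against the PuLP result you would pass in the same matrix (for a NumPy array, `A.tolist()` gives a nested list).

```python
from itertools import combinations
import random

def best_subset_brute_force(data, k):
    """Try every k-element index set and return (best_sum, best_indices)."""
    n = len(data)
    best = None
    for idx in combinations(range(n), k):
        # Sum of the square submatrix picked out by idx on both axes.
        total = sum(data[i][j] for i in idx for j in idx)
        if best is None or total > best[0]:
            best = (total, idx)
    return best

random.seed(0)  # fixed seed so the run is repeatable
n, k = 10, 6
data = [[round(random.random(), 2) for _ in range(n)] for _ in range(n)]

best_sum, best_idx = best_subset_brute_force(data, k)
print("Best indices:", best_idx)
print("Best sum:", round(best_sum, 2))
```

This only scales to small matrices, since the number of subsets grows combinatorially, but it is a cheap way to confirm that the MILP and the intended problem agree.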

solution1
2 2017-07-02 00:19:59
