
Python sparse matrix in Cplex?

I am working on a large quadratic programming problem. I would like to feed the Q matrix defining the objective function into IBM's CPLEX using the Python API. The Q matrix is built as a scipy lil_matrix because it is sparse. Ideally, I would like to pass the matrix to CPLEX directly. Does CPLEX accept a scipy lil_matrix?

I can convert Q to the list-of-lists format that CPLEX accepts; let's call it qMat. But qMat becomes too large and the machine runs out of memory (even with 120 GB).

Below is my work-in-progress code. In the actual problem, n is around half a million and m is around 5 million, and Q is given rather than randomly generated as in the example below.

from __future__ import division
import numpy as np
import cplex
import sys
import random
from scipy import sparse

n = 10
m = 5

def create():
    Q = sparse.lil_matrix((n, n))
    nums = random.sample(range(0, n), m)
    for i in nums:
        for j in nums:
            a = random.uniform(0,1)
            Q[i,j] = a
            Q[j,i] = a
    return Q

def convert(Q):
    qMat = [[[], []] for _ in range(n)]
    for k in range(n):  # note: range(n-1) would silently skip the last row
        qMat[k][0] = Q.rows[k]
        qMat[k][1] = Q.data[k]
    return qMat

Q = create()
qMat = convert(Q)
my_prob = cplex.Cplex()
my_prob.objective.set_quadratic(qMat)

If n = 500000 and m = 5000000, then that is 2.5e12 non-zeros. For each of these you'd need roughly one double for the non-zero value and one CPXDIM (a 4-byte integer) for the index. That is 8 + 4 = 12 bytes per non-zero. This would give:

>>> print(2.5e12 * 12 / 1024. / 1024. / 1024.)
27939.6772385

Roughly 28 TB of memory! It's not clear exactly how many non-zeros you plan to have, but with this calculation you can easily check whether what you're asking is even feasible.
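As a quick sanity check, the estimate above can be wrapped in a tiny helper (a sketch; the 8-byte double and 4-byte CPXDIM sizes are the assumptions stated above):

```python
def cplex_q_memory_gb(nnz):
    """Rough memory estimate for CPLEX quadratic data:
    8 bytes per double value plus 4 bytes per CPXDIM index,
    per non-zero entry."""
    return nnz * (8 + 4) / 1024.0 ** 3

# For a scipy sparse matrix you can plug in Q.nnz directly:
# cplex_q_memory_gb(Q.nnz)
print(cplex_q_memory_gb(2.5e12))  # ~27939.7 GB, i.e. about 28 TB
```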

As mentioned in the comments, the CPLEX Python API does not accept scipy lil matrices. You could try docplex, which is numpy friendly, or you could even try generating an LP file directly.
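If you go the LP-file route, a minimal sketch of streaming the quadratic objective of a lil_matrix straight to disk might look like the following. The variable names `x0..x{n-1}` and the bare LP skeleton are assumptions for illustration; check the CPLEX LP-format reference before relying on it:

```python
from scipy import sparse

def write_quadratic_lp(Q, path):
    """Stream the quadratic objective of a symmetric lil_matrix to a
    CPLEX LP-format file without building qMat in memory.

    Every stored entry (both triangles) is written, so the bracketed
    section equals x'Qx; the LP format's trailing "/ 2" then gives
    0.5 * x'Qx, the same convention objective.set_quadratic uses.
    """
    with open(path, "w") as f:
        f.write("Minimize\n obj: [")
        first = True
        for i, (cols, vals) in enumerate(zip(Q.rows, Q.data)):
            for j, v in zip(cols, vals):
                term = f"{v} x{i} ^2" if i == j else f"{v} x{i} * x{j}"
                f.write((" " if first else " + ") + term)
                first = False
        f.write(" ] / 2\nSubject To\nEnd\n")

# Tiny symmetric example matrix
Q = sparse.lil_matrix((2, 2))
Q[0, 0] = 2.0
Q[0, 1] = Q[1, 0] = 0.5
write_quadratic_lp(Q, "quad.lp")
```

Streaming row by row keeps the peak memory at one row of the matrix rather than the whole converted structure.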

Using something like the following is probably your best bet in terms of reducing the conversion overhead (I think I made an off-by-one error in the comments section above):

my_prob.objective.set_quadratic(list(zip(Q.rows, Q.data)))

or

my_prob.objective.set_quadratic([[row, data] for row, data in zip(Q.rows, Q.data)])

At any rate, you should play with these to see what gives the best performance (in terms of speed and memory).
