Repeating items when implementing a solution for a similar situation to the classic 0-1 knapsack

Question

This problem is largely the same as a classic 0-1 knapsack problem, but with some minor rule changes and a large dataset to play with.

Dataset (product ID, price, length, width, height, weight): (20,000 rows)

Problem :

A company is closing in fast on delivering its 1 millionth order. The marketing team decides to give the customer who makes that order a prize as a gesture of appreciation. The prize is: the lucky customer gets a delivery tote and 1 hour in the warehouse. Use the hour to fill up the tote with any products you desire and take them home for free.

Rules :

1 of each item
Combined volume < tote capacity (45 * 30 * 35 = 47250)
Item must fit individually (Dimensions are such that it can fit into the tote, eg 45 * 45 * 1 wouldn't fit)
Maximize value of combined products
Minimize weight on draws

Solution (using dynamic programming):

from functools import reduce

# The main solver function
def Solver(myItems, myCapacity):

    dp = {myCapacity: (0, (), 0)}
    getKeys = dp.keys

    for i in range(len(myItems)):
        itemID, itemValue, itemVolume, itemWeight = myItems[i]
        for oldVolume in list(getKeys()):

            newVolume = oldVolume - itemVolume

            if newVolume >= 0:
                myValue, ListOfItems, myWeight = dp[oldVolume]
                node = (myValue + itemValue, ListOfItems + (itemID,), myWeight + itemWeight)
                if newVolume not in dp:
                    dp[newVolume] = node
                else:
                    currentValue, loi, currentWeight = dp[newVolume]
                    if currentValue < node[0] or (currentValue == node[0] and node[-1] < currentWeight):
                        dp[newVolume] = node

    return max(dp.values())

# Generate the product of all elements within a given list
def List_Multiply(myList):
    return reduce(lambda x, y: x * y, myList)

toteDims = [30, 35, 45]  
totalVolume = List_Multiply(toteDims)
productsList = []  

with open('products.csv', 'r') as myFile:
    for myLine in myFile:
        myData = [int(x) for x in myLine.strip().split(',')]
        itemDims = [myDim for myDim, maxDim in zip(sorted(myData[2:5]), toteDims) if myDim <= maxDim]
        if len(itemDims) == 3:
            productsList.append((myData[0], myData[1], List_Multiply(myData[2:5]), myData[5]))

print(Solver(productsList, totalVolume))

Issue :

The output contains repeated items, e.g. (14018, (26, 40, 62, 64, 121, 121, 121, 152, 152), 13869)

How can I correct this to make it choose only 1 of each item?

Answer 1

Your code can produce answers with duplicate items because of the inner loop: while iterating over the volumes generated so far, the code may replace the solution stored for a volume that it has not yet visited in the same pass. When it later reaches that volume, it builds on a solution that already contains the current item.

For example, if your productsList contained the following

productsList = [
    # id, value, volume, weight
    [1, 1, 2, 1],
    [2, 1, 3, 2],
    [3, 3, 5, 1]
]

and

totalVolume = 10

then by the time you got to the third item, dp.keys() would contain:

10, 8, 7, 5

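On older Python versions the iteration order of a dict is not guaranteed (since Python 3.7 it is insertion order, which is the order shown), so for the sake of this example, let's assume it is as given above. Processing volume 10 replaces dp[5] with a new solution containing only item #3, namely (3, (3,), 1). Later in the same iteration, volume 5 is processed, and that replaced entry is used as the base for a new, seemingly even better solution at dp[0], namely (6, (3, 3), 2), which contains item #3 twice.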

To overcome the above problem, you could sort the keys before the iteration (in ascending order, which is the default), i.e. for oldVolume in sorted(getKeys()). Every write goes to newVolume, which is never larger than oldVolume, so assuming all items have a non-negative volume, this guarantees that we never replace a solution in dp before we have iterated over it.
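The effect of the iteration order can be checked on the three-item example above. The following is a minimal sketch, not the original code: solve and best are hypothetical names, the sort_keys flag exists only to compare the two orders, and the unsorted run relies on dicts iterating in insertion order (Python 3.7+).

```python
def solve(items, capacity, sort_keys=True):
    """Sparse 0-1 knapsack table keyed by remaining volume.

    items: iterable of (item_id, value, volume, weight).
    Returns a dict: remaining volume -> (value, item_ids, weight).
    """
    dp = {capacity: (0, (), 0)}
    for item_id, value, volume, weight in items:
        # Snapshot the keys. In ascending order, every write lands on a key
        # that has already been visited for this item.
        keys = sorted(dp) if sort_keys else list(dp)
        for old_volume in keys:
            new_volume = old_volume - volume
            if new_volume < 0:
                continue
            old_value, old_items, old_weight = dp[old_volume]
            node = (old_value + value, old_items + (item_id,), old_weight + weight)
            current = dp.get(new_volume)
            # Higher value wins; on equal value, lower weight wins.
            if current is None or (node[0], -node[2]) > (current[0], -current[2]):
                dp[new_volume] = node
    return dp


def best(dp):
    # Pick the highest value, breaking ties by the lowest weight.
    return max(dp.values(), key=lambda node: (node[0], -node[2]))


products = [(1, 1, 2, 1), (2, 1, 3, 2), (3, 3, 5, 1)]  # id, value, volume, weight

print(best(solve(products, 10, sort_keys=False)))  # (6, (3, 3), 2): item 3 twice
print(best(solve(products, 10, sort_keys=True)))   # (5, (1, 2, 3), 4)
```

With ascending keys the result uses each item once and its volumes add up to 2 + 3 + 5 = 10, which is within the capacity.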

Another possible problem is the way the optimal solution is picked at the end using max(dp.values()). The problem statement says that weight should be minimized in the case of a draw. If I'm reading the code correctly, the elements of each tuple are value, list of items, weight, in that order. Tuples compare element by element, so a tie on value is broken by the list of items rather than by weight. Below, the two candidates are tied on value and the latter would be preferable because of its smaller weight, but max returns the former because (2, 3) > (1, 3):

>>> max([(4, (2, 3), 3), (4, (1, 3), 2)])
(4, (2, 3), 3)

max accepts a key function, so something like this should work:

>>> max([(4, (2, 3), 3), (4, (1, 3), 2)], key=lambda x: (x[0], -x[-1]))
(4, (1, 3), 2)
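One way to keep the tie-break consistent is to use the same ordering for the update inside the loop and for the final selection. A minimal sketch, assuming nodes are (value, items, weight) tuples as in the question; rank, is_better and pick_best are hypothetical helper names, not part of the original code:

```python
def rank(node):
    """Sort key: higher value first, then lower weight."""
    value, _items, weight = node
    return (value, -weight)


def is_better(candidate, incumbent):
    """True if candidate should replace incumbent in the dp table."""
    return rank(candidate) > rank(incumbent)


def pick_best(nodes):
    """Final selection over all table entries, using the same ordering."""
    return max(nodes, key=rank)


candidates = [(4, (2, 3), 3), (4, (1, 3), 2)]
print(pick_best(candidates))                    # (4, (1, 3), 2)
print(is_better(candidates[1], candidates[0]))  # True
```

Because the list of items is ignored by rank, two nodes with the same value and weight count as equal and the first one seen is kept.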

Answer 1: accepted 2017-04-10 20:39:50
