Seeking a resource allocation algorithm

Question

For a project of mine (no, not a homework or exam problem, though I think it would make a good one), I am in need of an algorithm. The problem seems familiar and general enough that I'm pretty confident it has been solved in the literature, but I do not have my algorithms books handy and it is not clear what terms would be used to describe it, so googling is of limited use.

Stripped of extraneous detail, the problem is as follows: You are given a set of resources { R_1, R_2, ... R_n } and a set of tasks { T_1, T_2, ... T_m }. Each task can be accomplished using any one of alternative sets of resources TR_m = { { R_1m1, R_1m2, ... }, { R_2m1, R_2m2, ... }, ... }. Each resource can only be used by one task at a time. The problem is to see if all tasks can be fulfilled at the same time or, if that is not possible, what the largest number of tasks (starting at T_1) is that can be fulfilled simultaneously.

A naive algorithm which just assigns each task the first set of available resources is bound to fail unnecessarily sometimes: Think of TR_1 = { { R_1, R_2 }, { R_1 } } and TR_2 = { { R_1 }, { R_2 } }. T_1 would grab R_1 and R_2 and T_2 would fail, while T_1 could have just taken R_1 and T_2 could have taken R_2.
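For illustration, here is a minimal sketch of that naive approach on the example above. The function name and the list-of-sets representation are assumptions made for this sketch, not the actual project code.

```python
def naive_assign(task_options):
    """Give each task, in order, the first option whose resources are all free.

    task_options: one entry per task, each a list of alternative resource sets.
    Returns the chosen set per task, or None where a task could not be served.
    """
    used = set()
    chosen = []
    for options in task_options:
        # First alternative that does not touch an already allocated resource.
        pick = next((opt for opt in options if not (opt & used)), None)
        if pick is not None:
            used |= pick
        chosen.append(pick)
    return chosen


# TR_1 = {{R_1, R_2}, {R_1}}, TR_2 = {{R_1}, {R_2}}
TR = [
    [{"R_1", "R_2"}, {"R_1"}],
    [{"R_1"}, {"R_2"}],
]
result = naive_assign(TR)
print(result)  # T_1 takes both resources, so the entry for T_2 is None
```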

I am looking for an algorithm, preferably elegant and simple, that would do a better job.

Insofar as it matters, the resources largely consist of interchangeable subsets and tasks usually require just one or more resources from each set, so the naive algorithm usually succeeds, but that will not always be the case.

Moreover, there are usually fewer than a dozen tasks and resources and the problem (coded currently in Python 3) is not real-time, so brute-force would generally be an acceptable solution, but I am looking for something better.

Any suggestions or links?

Answer 1

Suppose all the tasks are identical.

Then your problem is equivalent to the known NP-complete maximum set packing problem.

So your problem is certainly NP-hard and you are unlikely to find a perfect algorithm for this.

Answer 2

You could use Branch and Bound.

You'd branch on "for task i, which set do I pick?", picking the biggest set first to cause failure as high up in the tree as possible to save work. For the initial solution, you can flip that around to find a reasonable (but not optimal) solution quickly, which ends up saving work by pruning more in the real search later.

You could also branch on the s[q,t] of the following model which is closest to 0.5 (in a way, the choice it is "least sure about").

The bound could be based on the linear relaxation of this ILP model:

maximize sum of x[t] over all t

variables:
0 <= x[t] <= 1  ; x[t] decides whether task t is scheduled or not
0 <= r[i,t] <= 1 ; r[i,t] decides whether task t uses resource i
0 <= s[q,t] <= 1 ; s[q,t] decides whether set q of task t is used

constraints:
1. for all tasks t: (sum s[q,t] over all valid q) - x[t] = 0
2. for all sets s[q,t]: (sum r[i,t] over all i in the set) - size_of_set * s[q,t] >= 0
3. for all resources i: sum r[i,t] over all t <= 1

Constraint 1 forces exactly 1 set of resources to be associated with any task that is chosen.
Constraint 2 forces the resources used by choosing set q for task t to be used by task t (>= 0 because sets may overlap).
Constraint 3 forces all resources to be used no more than once.

I may have made mistakes in the model, and I'm not sure how good it is. Anyway, solve it with linear programming (no doubt there's a library for it for Python) and then do a couple of Gomory cuts for good measure (they may look scary, but they're actually pretty simple to program). Not too many though: trying to get an all-integer solution with only Gomory cuts often converges very slowly. Doing some of them is a cheap way to improve a solution.

This will give you an estimate that will let you prune some of the search space. How much it actually prunes depends on how close it gets to the actual solution. I predict that it will tend to select several sets belonging to a task with some factor between 0 and 1, because selecting a set "only a bit" allows it to use the same resource for multiple tasks. It has to pick several sets then because it must use a total of 1 set per task, but that also means it has more choice of resources so it can do that. Linear Programming is sneaky that way, always trying to give you, in a sense, the most annoying answer :)
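To make the fractional behaviour concrete, here is a small sketch that only checks constraints 1 to 3 for given variable values; it does not solve the LP. The function name, the dictionary layout and the three-task example are assumptions made up for this illustration. In the example any two tasks share a resource, so at most one task fits in an integer solution, yet the relaxation accepts half of every task for an objective of 1.5.

```python
def satisfies_relaxation(sets, x, s, r, resources, eps=1e-9):
    """Check constraints 1-3 of the relaxed model for given variable values.

    sets[t] is the list of resource sets of task t. x[t], s[(q, t)] and
    r[(i, t)] hold the variable values; missing r entries count as 0.
    """
    tasks = range(len(sets))
    in_unit_range = all(-eps <= v <= 1 + eps
                        for d in (x, s, r) for v in d.values())
    # 1. the set variables of a task add up to x[t]
    c1 = all(abs(sum(s[q, t] for q in range(len(sets[t]))) - x[t]) <= eps
             for t in tasks)
    # 2. a chosen set pulls in its resources
    c2 = all(sum(r.get((i, t), 0) for i in sets[t][q])
             >= len(sets[t][q]) * s[q, t] - eps
             for t in tasks for q in range(len(sets[t])))
    # 3. every resource is used at most once in total
    c3 = all(sum(r.get((i, t), 0) for t in tasks) <= 1 + eps
             for i in resources)
    return in_unit_range and c1 and c2 and c3


# Made-up example: three tasks, one set each, every pair of sets overlaps.
sets = [[{"A", "B"}], [{"B", "C"}], [{"A", "C"}]]
pairs = [(i, t) for t in range(3) for i in sets[t][0]]

half = satisfies_relaxation(
    sets,
    x={t: 0.5 for t in range(3)},
    s={(0, t): 0.5 for t in range(3)},
    r={p: 0.5 for p in pairs},
    resources="ABC",
)
whole = satisfies_relaxation(
    sets,
    x={t: 1 for t in range(3)},
    s={(0, t): 1 for t in range(3)},
    r={p: 1 for p in pairs},
    resources="ABC",
)
print(half, whole)  # the half-of-everything point is feasible, all-ones is not
```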

Of course in this model, you'd exclude any possibilities that are no longer possible (allocated resources and the sets that contain them and tasks that would have zero possible sets), and skip tasks that are already scheduled.

If this is too complicated, you can calculate a much simpler bound like this: for all tasks t, take the size of their smallest set s[t]. Check how many you can take until the total size is larger than the number of unallocated resources (so take the smallest, add the next smallest, and so on - sort them or use a min-heap).
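A minimal sketch of that simpler bound, assuming you have already collected the smallest usable set size of every unscheduled task (the function and parameter names are made up here):

```python
def simple_bound(smallest_set_sizes, free_resources):
    """Upper bound on how many more tasks can still be scheduled.

    smallest_set_sizes: size of the smallest usable set of each unscheduled task.
    free_resources: number of resources that are not yet allocated.
    """
    count = 0
    total = 0
    # Take the smallest first, then the next smallest, and so on.
    for size in sorted(smallest_set_sizes):
        if total + size > free_resources:
            break
        total += size
        count += 1
    return count


print(simple_bound([2, 1, 3, 1], 4))  # sizes 1, 1 and 2 fit into 4 resources
```

The bound is optimistic because it ignores which resources the sets actually contain; it only counts them.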

Of course, if with the resources allocated so far so many tasks are now without any possible set that in total (including the ones that were already scheduled) you cannot get more than the best solution so far, you can give up on the current branch of the recursion tree.
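Putting the branching rule, the simple counting bound and this pruning test together could look roughly like the sketch below. It is one possible arrangement, not a tuned implementation: it maximises the plain number of scheduled tasks (an assumption about what the question wants), it uses the simple bound rather than the LP relaxation, and all names are made up for illustration.

```python
def max_tasks(task_options):
    """Branch and bound: return (count, choice) with the most tasks scheduled.

    task_options[t] is a list of resource sets; any one of them fulfils task t.
    choice[t] is the set picked for task t, or None if t is left out.
    """
    n = len(task_options)
    all_resources = set().union(*(o for opts in task_options for o in opts))
    best = {"count": -1, "choice": None}

    def bound(t, used):
        # Optimistic count for tasks t..n-1: smallest still-usable set of each,
        # packed smallest first into the number of free resources.
        sizes = sorted(min(len(o) for o in opts if not (o & used))
                       for opts in task_options[t:]
                       if any(not (o & used) for o in opts))
        free, count = len(all_resources) - len(used), 0
        for size in sizes:
            if size > free:
                break
            free -= size
            count += 1
        return count

    def search(t, used, choice, done):
        if done + bound(t, used) <= best["count"]:
            return  # cannot beat the best solution found so far
        if t == n:
            best["count"], best["choice"] = done, list(choice)
            return
        # Biggest set first, so that dead ends show up high in the tree.
        for opt in sorted(task_options[t], key=len, reverse=True):
            if not (opt & used):
                choice.append(opt)
                search(t + 1, used | opt, choice, done + 1)
                choice.pop()
        choice.append(None)  # branch in which task t is not scheduled
        search(t + 1, used, choice, done)
        choice.pop()

    search(0, set(), [], 0)
    return best["count"], best["choice"]


# The example from the question: both tasks can be served.
count, choice = max_tasks([[{"R_1", "R_2"}, {"R_1"}], [{"R_1"}, {"R_2"}]])
print(count, choice)
```

Seeding `best` with a greedy solution before the search starts would let the pruning test fire earlier.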

For the initial solution, you could try using the greedy algorithm you described but with one change: take the smallest set that only contains unallocated resources. That way it tries to "keep out of the way" of further tasks, though obviously you can construct cases where this is worse than picking the first possible set.
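As a sketch of that modified greedy pass (the function name and data layout are assumptions for illustration):

```python
def greedy_smallest(task_options):
    """Give each task, in order, its smallest set made of free resources only.

    Returns the chosen set per task, or None where no set was available.
    """
    used = set()
    chosen = []
    for options in task_options:
        free = [opt for opt in options if not (opt & used)]
        pick = min(free, key=len) if free else None
        if pick is not None:
            used |= pick
        chosen.append(pick)
    return chosen


# On the example from the question this serves both tasks.
initial = greedy_smallest([[{"R_1", "R_2"}, {"R_1"}], [{"R_1"}, {"R_2"}]])
print(initial)
```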

edit: and of course if there is a set in the collection of sets of a task that is a superset of another set in that collection, just delete it. It can't be any better to use that superset. That happens to fix the example given in the OP, but typically it wouldn't.
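That preprocessing step is short to write down. A sketch, with a made-up function name, applied to the options of a single task:

```python
def drop_supersets(options):
    """Keep only the sets that have no proper subset among the options."""
    unique = []
    for opt in options:
        if opt not in unique:  # drop exact duplicates first
            unique.append(opt)
    # b < a is Python's proper-subset test on sets.
    return [a for a in unique if not any(b < a for b in unique)]


# TR_1 from the question loses its larger set.
print(drop_supersets([{"R_1", "R_2"}, {"R_1"}]))
```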


Answer 1
1 2014-11-01 13:09:11

Answer 2
1 accepted 2014-11-01 14:16:15
