
Best data structure to record a Pareto front

Original post: 2018-03-10 09:44:40 · Tags: priority-queue, red-black-tree, pareto-chart, ordered-set

May I ask if someone has already seen or faced the following problem?

I need to handle a list of cost/profit values c₁/p₁, c₂/p₂, c₃/p₃, ... that satisfies:

c₁ ≤ c₂ ≤ c₃ ≤ c₄ ...
p₁ ≤ p₂ ≤ p₃ ≤ p₄ ...

This is an example: 2/3, 4/5, 9/15, 12/19

If one tries to insert 10/14 into the above list, the operation is rejected because of the existing cost/profit pair 9/15: it is never useful to increase the cost (9->10) and decrease the profit (15->14). Such lists can arise, for instance, in (the states of) dynamic programming algorithms for knapsack problems, where the costs can represent weights.

If one inserts 7/20 into the above list, this should trigger the deletion of 9/15 and 12/19.

I have written a solution using the C++ std::set (often implemented with red-black trees), but I needed to provide a comparison function that eventually became overly complex. Also, insertion into such a set takes logarithmic time in general, but a single insertion can take linear time (in terms of non-amortized complexity), for example when it triggers the deletion of all other elements.
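For illustration, here is a minimal sketch of how the insertion rule above could be written without a custom comparison function, assuming integer costs and profits and a std::map keyed by cost. The class name `ParetoFront` and its members are hypothetical, not the code referred to in this question.

```cpp
#include <iterator>
#include <map>

// Hypothetical helper: keys are costs, mapped values are profits.
// Invariant: costs and profits are both strictly increasing along the map.
class ParetoFront {
public:
    // Returns false if (cost, profit) is dominated by an existing pair.
    bool insert(int cost, int profit) {
        // First entry whose cost is strictly greater than the new cost.
        auto next = entries_.upper_bound(cost);
        // The entry just before it has the largest cost <= the new cost and,
        // by the invariant, the largest profit among those entries.
        if (next != entries_.begin() && std::prev(next)->second >= profit) {
            return false;  // rejected: not cheaper and not more profitable
        }
        // Entries with cost >= the new cost and profit <= the new profit
        // are dominated; they form a contiguous run starting here.
        auto first = entries_.lower_bound(cost);
        auto last = first;
        while (last != entries_.end() && last->second <= profit) {
            ++last;
        }
        entries_.erase(first, last);
        entries_.emplace(cost, profit);
        return true;
    }

    const std::map<int, int>& entries() const { return entries_; }

private:
    std::map<int, int> entries_;
};
```

With the example list 2/3, 4/5, 9/15, 12/19, `insert(10, 14)` returns false and `insert(7, 20)` removes 9/15 and 12/19. A single call can still erase many entries, but each entry is erased at most once over its lifetime, so the deletion work averages out to a constant per insertion on top of the logarithmic search.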

I wonder if better solutions exist, given that there are countless ways to implement (ordered) sets, e.g., priority queues, heaps, linked lists, hash tables, etc.

This is a Pareto front (obj1: min cost, obj2: max profit), but I still could not find the best structure to record it.

1 Answer

I did not fully understand the rules you described, so I will say agnostically that an insertion attempt might be rejected and that, if it is accepted, subsequent items may need to be removed.

You will need to use a balanced comparison tree, represented as an array. In that case, finding the nodes you need will take O(logN) time, which will be the complexity of a search or of a rejected insertion attempt. When you need to remove items, you remove them and insert the new one, which has a complexity of

O(logN + N + N + logN) (that is, searching, removing, rebalancing and inserting; we could get rid of the last logarithm if, while rebalancing, we know where the new item is to be inserted)

O(logN + N + N + logN) = O(2logN + 2N) = O(log(N^2) + 2N) = O(N), that is, linear complexity in the worst case.
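The search, remove and insert steps can be sketched roughly as follows. As a simplifying assumption, a plain sorted std::vector stands in for the array-backed tree, so this is not the exact structure described above; the names `Entry` and `insert_into_front` are hypothetical, and the dominance rule is the one stated in the question.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Hypothetical entry type: one cost/profit pair.
struct Entry {
    int cost;
    int profit;
};

// `front` is assumed sorted by strictly increasing cost and profit.
// Returns false if `e` is rejected, true if it was inserted.
bool insert_into_front(std::vector<Entry>& front, Entry e) {
    // Binary search, O(log N): first entry with cost > e.cost.
    auto next = std::upper_bound(
        front.begin(), front.end(), e.cost,
        [](int c, const Entry& x) { return c < x.cost; });
    // The preceding entry has the best profit among costs <= e.cost.
    if (next != front.begin() && std::prev(next)->profit >= e.profit) {
        return false;  // rejected insertion attempt costs only the search
    }
    // Binary search, O(log N): first entry with cost >= e.cost.
    auto first = std::lower_bound(
        front.begin(), front.end(), e.cost,
        [](const Entry& x, int c) { return x.cost < c; });
    // Linear scan over the entries that the new one makes useless.
    auto last = first;
    while (last != front.end() && last->profit <= e.profit) {
        ++last;
    }
    // Removing shifts the tail, O(N) in the worst case; the new entry
    // then goes exactly where the removed run started.
    first = front.erase(first, last);
    front.insert(first, e);
    return true;
}
```

Here the position found during removal is reused for the insertion, which corresponds to dropping the last logarithm; the shifting of array elements keeps the worst case linear.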