Data structure to select number with probability proportional to its value in less than O(n)

I have a set of numbers, `[1, 3, 4, 5, 7]`. I want to select a number from the set with a probability proportional to its value:

| Number | Probability | % |
|---|---|---|
| 1 | 1/20 | 5 |
| 3 | 3/20 | 15 |
| 4 | 4/20 | 20 |
| 5 | 5/20 | 25 |
| 7 | 7/20 | 35 |

However, I want to be able to both update and query this set in less than O(n) time. Are there any data structures that would help me achieve that? Preferably in Java, if one already exists.

You can get `O(log n)` amortized querying and updating (including removing/inserting elements) using a Binary Indexed Tree, also known as a Fenwick tree. The idea is to use dynamic resizing, which is the same trick used in variable-size arrays and hash tables to get amortized constant-time appends. This also implies that you should be able to get `O(log n)` worst-case bounds using the dynamic-array method of rebuilding a second array on the side, but this makes the code significantly longer.

First, we know that given a list of the partial sums of `arr` (starting with the empty sum 0) and a random integer in the half-open range `[0, sum(arr))`, we can make the selection in `O(log n)` time with a binary search. Specifically, if our random integer is `r`, we want the index of the rightmost partial sum less than or equal to `r`. For `arr = [1, 3, 4, 5, 7]` the partial sums are `0, 1, 4, 8, 13, 20`, so `r = 9` gives index 3, which selects the value 5.

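
As a rough illustration of the static case (no updates yet), here is a minimal Java sketch of the partial-sum binary search. The class name `PrefixSumSampler` and its methods are made up for this example, and it assumes non-negative integer weights whose total is positive and fits in an `int`.

```java
import java.util.Random;

// Hypothetical helper for illustration: static weights, no updates.
final class PrefixSumSampler {
    // prefix[i] = arr[0] + ... + arr[i-1], so prefix[0] is the empty sum 0.
    private final int[] prefix;

    PrefixSumSampler(int[] weights) {
        prefix = new int[weights.length + 1];
        for (int i = 0; i < weights.length; i++) {
            prefix[i + 1] = prefix[i] + weights[i];
        }
    }

    int total() {
        return prefix[prefix.length - 1];
    }

    /** Index of the rightmost partial sum <= r, assuming 0 <= r < total(). */
    int indexFor(int r) {
        int lo = 0;
        int hi = prefix.length - 1; // invariant: prefix[lo] <= r < prefix[hi]
        while (hi - lo > 1) {
            int mid = (lo + hi) >>> 1;
            if (prefix[mid] <= r) {
                lo = mid;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    /** Draws an index with probability proportional to its weight. */
    int sampleIndex(Random rng) {
        return indexFor(rng.nextInt(total())); // r is uniform in [0, total)
    }
}
```

With the weights `{1, 3, 4, 5, 7}`, `indexFor(9)` returns 3, matching the value 5. Building the prefix array costs `O(n)`, which is why a plain array is not enough once the weights change.
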
Now, we'll use the technique from this post on Fenwick trees to maintain and query the partial sums. That post is slightly different from yours: it has a fixed set of `n` keys whose weights can be updated, without new insertions or deletions.

A Fenwick tree is an array that allows you to answer queries about partial sums of a 'base' array in `O(log n)` time per query, and can be built in `O(n)` time. In particular, you can

- find the index of the rightmost partial sum of `arr` less than or equal to `r`,
- set `arr[i]` to `arr[i]+c` for any integer `c`,

both in `O(log n)` time.

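
To make those operations concrete, here is a small Java sketch of a Fenwick tree with linear-time construction, point update and the rightmost-partial-sum search. The class and method names (`FenwickTree`, `add`, `prefixSum`, `search`) are my own choices rather than anything from the linked post, and the search assumes all weights stay non-negative.

```java
// Illustrative sketch; names and layout are assumptions, not a library API.
final class FenwickTree {
    private final long[] tree; // 1-based; tree[i] sums arr[i - lowbit(i) .. i - 1]
    private final int n;

    /** Builds the tree from the base array in O(n). */
    FenwickTree(long[] base) {
        n = base.length;
        tree = new long[n + 1];
        for (int i = 1; i <= n; i++) {
            tree[i] += base[i - 1];
            int parent = i + (i & -i);
            if (parent <= n) {
                tree[parent] += tree[i]; // push the finished node into its parent
            }
        }
    }

    /** arr[i] += c in O(log n), where i is a 0-based index. */
    void add(int i, long c) {
        for (int j = i + 1; j <= n; j += j & -j) {
            tree[j] += c;
        }
    }

    /** Sum of the first count elements of arr, in O(log n). */
    long prefixSum(int count) {
        long sum = 0;
        for (int j = count; j > 0; j -= j & -j) {
            sum += tree[j];
        }
        return sum;
    }

    /**
     * Largest pos with prefixSum(pos) <= r, in O(log n).
     * For 0 <= r < total this is the 0-based index of the selected element.
     */
    int search(long r) {
        int pos = 0;
        for (int step = Integer.highestOneBit(n); step > 0; step >>= 1) {
            int next = pos + step;
            if (next <= n && tree[next] <= r) {
                pos = next;
                r -= tree[next];
            }
        }
        return pos;
    }
}
```

Because the search keeps descending while `tree[next] <= r`, it skips over zero-weight slots, so an element whose weight has been set to 0 is never returned for a valid `r`.
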
Start by appending `n` zeros to `arr` (it is now half full), and build its Fenwick tree. We can treat 'removing' an element as setting its weight to 0. Inserting an element is done by taking the zero after the rightmost nonzero element in `arr` as the new element's spot. The removed elements and new elements may eventually cause our array to fill up: if we reach 75% capacity, rebuild our array and Fenwick tree, doubling the array size (pad with zeros on the right) and deleting all the zero-weight elements. If we reach 25% capacity, shrink the array to half size, rebuilding the Fenwick tree as well.

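
The resizing scheme could look roughly like the Java sketch below, which stores distinct positive `int` values whose weight is the value itself. Everything here is an assumption made for illustration: the class name `ProportionalSet`, the minimum capacity of 4, and my reading of the thresholds (grow when 75% of the slots have been handed out, shrink when live elements fall to 25% of capacity). It also always takes the next never-used slot instead of reusing a freed slot at the right end.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ThreadLocalRandom;

// Illustrative sketch only; thresholds and names are assumptions.
final class ProportionalSet {
    private int[] arr = new int[4];    // slot -> value; 0 means empty or removed
    private long[] tree = new long[5]; // Fenwick tree over arr, 1-based
    private final Map<Integer, Integer> slotOf = new HashMap<>(); // value -> slot
    private int used = 0;              // slots handed out since the last rebuild
    private long total = 0;            // sum of all live values

    boolean add(int value) {
        if (value <= 0) throw new IllegalArgumentException("value must be positive");
        if (slotOf.containsKey(value)) return false;
        if (used * 4 >= arr.length * 3) rebuild(arr.length * 2); // 75% consumed
        arr[used] = value;
        slotOf.put(value, used);
        fenwickAdd(used, value);
        used++;
        total += value;
        return true;
    }

    boolean remove(int value) {
        Integer slot = slotOf.remove(value);
        if (slot == null) return false;
        fenwickAdd(slot, -value); // "removing" sets the weight to 0
        arr[slot] = 0;
        total -= value;
        if (arr.length > 4 && slotOf.size() * 4 <= arr.length) rebuild(arr.length / 2);
        return true;
    }

    /** Random element, chosen with probability value / total. */
    int sample() {
        if (total == 0) throw new IllegalStateException("set is empty");
        return pick(ThreadLocalRandom.current().nextLong(total));
    }

    /** Element selected by r, for 0 <= r < total (rightmost partial sum <= r). */
    int pick(long r) {
        if (r < 0 || r >= total) throw new IllegalArgumentException("r out of range");
        int pos = 0;
        for (int step = Integer.highestOneBit(arr.length); step > 0; step >>= 1) {
            int next = pos + step;
            if (next <= arr.length && tree[next] <= r) {
                pos = next;
                r -= tree[next];
            }
        }
        return arr[pos];
    }

    int size() { return slotOf.size(); }

    int capacity() { return arr.length; }

    private void fenwickAdd(int slot, long delta) {
        for (int j = slot + 1; j < tree.length; j += j & -j) tree[j] += delta;
    }

    /** Compacts live values to the left and rebuilds the Fenwick tree in O(capacity). */
    private void rebuild(int capacity) {
        int[] old = arr;
        arr = new int[capacity];
        tree = new long[capacity + 1];
        slotOf.clear();
        used = 0;
        for (int v : old) {
            if (v != 0) {
                arr[used] = v;
                slotOf.put(v, used);
                used++;
            }
        }
        for (int i = 1; i <= capacity; i++) {
            tree[i] += arr[i - 1];
            int parent = i + (i & -i);
            if (parent <= capacity) tree[parent] += tree[i];
        }
    }
}
```

Each rebuild costs time linear in the capacity but only happens after a proportional number of insertions or removals, which is where the amortized `O(log n)` bound comes from. `pick` is exposed separately from `sample` only so the selection can be checked with a known `r`.
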
You'll need to maintain `arr` constantly to be able to rebuild, so all updates must be done on both `arr` and the Fenwick tree. You'll also need a hashmap from array indices to your keys for random selection.

The good part is that you don't need to modify the Fenwick tree internals at all: given a Fenwick tree implementation in Java that supports initialization, array updates and the binary search, you can treat it as a black box. This stops being true if you want worst-case time guarantees: then you'll need to copy the internal state of the Fenwick tree, piece by piece, which has some complications.