How do I select a random element from a std::set in less than O(n) time?

Question

This question, but with an added constraint.

I'm willing to allow non-uniform selection as long as it's not too lopsided.

Given that "sets are typically implemented as binary search trees", and since I expect they carry some kind of depth or size information for balancing, I would expect that some sort of weighted random walk of the tree is possible. However, I don't know of any remotely portable way to do that.

Edit: The constraint is NOT on the amortized time.

Answer 1

Introduce an array whose size equals that of the set. Make the array elements hold the addresses of every element in the set. Generate a random integer R bounded by the array/set size, pick the address stored in the array element indexed by R, and dereference it to obtain the set's element.
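A minimal sketch of this idea, assuming the set is not modified between building the array and picking from it. The names build_index and pick_random are made up for illustration. Building the array is O(n) and has to be redone (or patched) whenever the set changes; each pick afterwards is O(1).

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <set>
#include <vector>

// Build a side array of pointers to the set's elements (O(n), done once).
// std::set does not relocate its nodes, so each pointer stays valid until
// the element it points to is erased.
template <typename T>
std::vector<const T*> build_index(const std::set<T>& s) {
    std::vector<const T*> index;
    index.reserve(s.size());
    for (const T& elem : s)
        index.push_back(&elem);
    return index;
}

// Pick a uniformly random element in O(1). Precondition: index is not empty.
template <typename T, typename Rng>
const T& pick_random(const std::vector<const T*>& index, Rng& rng) {
    assert(!index.empty());
    std::uniform_int_distribution<std::size_t> dist(0, index.size() - 1);
    return *index[dist(rng)];
}
```

With a std::set<int> s and a std::mt19937 rng, usage would be roughly: auto index = build_index(s); int x = pick_random(index, rng);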

Answer 2

I don't see how to do it with just std::set, so you probably need a different data structure. Like Victor Sorokin said, you can combine a set with a vector. Instead of set<T>, use map<T, size_t>, plus vector<map<T, size_t>::iterator>. The value for each key is an index into the vector, and each element of the vector points back to the map element. The vector elements have no particular order. When you add an element, put it at the end of the vector. When you remove an element and it's not the last one in the vector, move the last element to the deleted element's position and update that element's index in the map.
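The map-plus-vector combination could be wrapped roughly as follows. This is only a sketch: the class name RandomAccessSet and its member names are hypothetical, and only insert, erase, and random selection are covered.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <random>
#include <vector>

// Hypothetical wrapper: an ordered set with O(1) uniform random selection.
template <typename T>
class RandomAccessSet {
    using Map = std::map<T, std::size_t>;
    Map index_of_;                               // key -> slot in slots_
    std::vector<typename Map::iterator> slots_;  // slot -> map entry, no order

public:
    // O(log n). Returns false if the value was already present.
    bool insert(const T& value) {
        auto result = index_of_.emplace(value, slots_.size());
        if (!result.second) return false;
        slots_.push_back(result.first);
        return true;
    }

    // O(log n). Moving the last slot into the hole keeps the vector dense.
    bool erase(const T& value) {
        auto it = index_of_.find(value);
        if (it == index_of_.end()) return false;
        const std::size_t slot = it->second;
        if (slot != slots_.size() - 1) {
            slots_[slot] = slots_.back();  // move last entry into the hole
            slots_[slot]->second = slot;   // and fix its stored index
        }
        slots_.pop_back();
        index_of_.erase(it);
        return true;
    }

    std::size_t size() const { return slots_.size(); }

    // O(1), uniform. Precondition: the set is not empty.
    template <typename Rng>
    const T& random_element(Rng& rng) const {
        assert(!slots_.empty());
        std::uniform_int_distribution<std::size_t> dist(0, slots_.size() - 1);
        return slots_[dist(rng)]->first;
    }
};
```

This relies on std::map iterators staying valid when other elements are inserted or erased, which is what lets the vector hold them safely.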

Answer 3

If you know the distribution of the elements in the set, you can randomly select a key (with that same distribution) and use std::set::lower_bound. That's a lot of ifs, though.

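#include <cstdlib>
#include <iostream>
#include <set>

int main() {
    std::set<float> container;
    for (int i = 0; i < 10000; ++i)
        container.insert(i * 0.01f);
    // roughly even distribution of 10000 floats between 0 and 100
    float key = std::rand() * 100.0f / RAND_MAX; // rand() is a weak RNG, but enough for a demo
    std::set<float>::iterator iter = container.lower_bound(key); // O(log n)
    if (iter == container.end())
        --iter; // key was above the largest element, so fall back to the last one
    std::cout << *iter;
    return 0;
}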

Answer 4

For std::unordered_set<int> s:

1) Take a random R in min(s)..max(s).

2) If R is in s: return R.

3) Otherwise:

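auto newIter = s.insert(R).first;  // temporarily insert the probe value
++newIter;                         // step to its successor in iteration order
if (newIter == s.end()) {
    newIter = s.begin();           // wrap around
}
auto result = *newIter;
s.erase(R);                        // remove the probe value again
return result;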

For an ordered set (std::set), the probability would depend on the distance between elements. unordered_set is randomized by the hash.

I hope this helps.

PS: Converting std::set<V> into std::set<std::pair<int, V>> (where the first element of the pair is a hash of the second) makes this method suitable for any hashable V.
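Put together, the three steps might look like the sketch below. The function name pick_by_probe is hypothetical, and lo and hi are assumed to be bounds on the stored values that the caller already tracks, since computing min(s) and max(s) on an unordered_set would itself take O(n). Note also that the temporary insert can trigger a rehash, so the cost per call is only amortized constant.

```cpp
#include <cassert>
#include <random>
#include <unordered_set>

// Sketch of the probe-and-step idea. lo and hi are assumed to be known bounds
// on the values in s; s must not be empty.
template <typename Rng>
int pick_by_probe(std::unordered_set<int>& s, int lo, int hi, Rng& rng) {
    assert(!s.empty());
    std::uniform_int_distribution<int> dist(lo, hi);
    const int r = dist(rng);           // step 1: random probe value
    if (s.count(r) != 0)
        return r;                      // step 2: the probe hit an element

    auto it = s.insert(r).first;       // step 3: insert the probe temporarily
    ++it;                              // its successor in iteration order
    if (it == s.end())
        it = s.begin();                // wrap around
    const int result = *it;
    s.erase(r);                        // restore the original contents
    return result;
}
```

Because s is not empty, the wrap-around can never land on the probe value itself.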

Answer 5

You may be able to make a randomly ordered copy of the set by using this constructor:

template <class InputIterator>
set(InputIterator f, InputIterator l,
    const key_compare& comp)

...and passing a comparator that compares hashes of the keys (or some other deterministic spreading function). Then take the "smallest" keys according to this new set.

You could construct the set once and amortize the cost across several requests for a "random" element.
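A rough sketch of such a hash-ordered copy for int keys. The names HashOrder, ShuffledSet, make_shuffled, and pick are invented for this example, and the mixing function is a splitmix64-style finalizer chosen here as an assumption; std::hash<int> is often the identity and would not reorder anything. The copy costs O(n log n) once, so this only pays off when amortized over several picks.

```cpp
#include <cassert>
#include <cstdint>
#include <set>

// Orders ints by a salted hash instead of by value.
struct HashOrder {
    std::uint64_t salt;  // a different salt gives a different ordering

    static std::uint64_t mix(std::uint64_t x) {
        // splitmix64-style finalizer; any well-spreading function would do
        x ^= x >> 30; x *= 0xbf58476d1ce4e5b9ULL;
        x ^= x >> 27; x *= 0x94d049bb133111ebULL;
        x ^= x >> 31;
        return x;
    }

    bool operator()(int a, int b) const {
        const std::uint64_t ha = mix(static_cast<std::uint64_t>(a) + salt);
        const std::uint64_t hb = mix(static_cast<std::uint64_t>(b) + salt);
        if (ha != hb) return ha < hb;
        return a < b;  // tie-break so equal hashes never drop a key
    }
};

using ShuffledSet = std::set<int, HashOrder>;

// O(n log n) copy using the (first, last, comp) constructor.
inline ShuffledSet make_shuffled(const std::set<int>& s, std::uint64_t salt) {
    return ShuffledSet(s.begin(), s.end(), HashOrder{salt});
}

// The "smallest" key under the hash order. Precondition: not empty.
inline int pick(const ShuffledSet& shuffled) {
    assert(!shuffled.empty());
    return *shuffled.begin();
}
```

Repeated calls to pick on the same copy return the same element, so successive "random" elements would come from walking the copy in order or from rebuilding it with a new salt.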

Answer 1: 6, 2011-11-28 21:10:09
Answer 2: 3, 2011-11-29 00:00:04
Answer 3: 1, 2011-11-28 22:01:39
Answer 4: 0, 2014-02-11 13:24:00
Answer 5: 0, 2011-11-28 21:05:40
