How to get 100 random elements from a HashSet in Java?

I have a HashSet containing 10000 elements, and I want to extract 100 random elements from it. I thought I could shuffle the set, but that doesn't work.

Set<String> users = new HashSet<String>();

// for randomness, but this doesn't work
Collections.shuffle(users, new Random(System.nanoTime()));  

// and use for loop to get 100 elements

Since I cannot use shuffle here, what is a good way to get 100 random elements from a HashSet in Java?

Without building a new list, you can implement the following algorithm:

n = 100
d = 10000  # length(users)
for user in users:
    generate a random number p between 0 and 1
    if p <= n / d:
        select user
        n -= 1
    d -= 1

As you iterate through the set, each element you take lowers the probability for later elements by decreasing n, while every step raises it by decreasing d. Initially, you have a 100/10000 chance of choosing the first element. If you take that element, you have a 99/9999 chance of choosing the second; if you don't take the first one, you have a slightly better 100/9999 chance of picking the second. The math works out so that, in the end, every element has a 100/10000 chance of being selected for the output. For the second element, for example, that is (100/10000)(99/9999) + (9900/10000)(100/9999) = 100/10000.
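The pseudocode above could be written in Java roughly as follows. This is a minimal sketch, not a definitive implementation: the class name SelectionSampling and the helper sample are hypothetical names chosen for illustration, and the comparison uses random.nextInt(remaining) < needed rather than a floating-point p, which gives the same needed / remaining probability.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

class SelectionSampling {

    // Hypothetical helper: picks k elements from source in a single pass,
    // without copying the whole set into a list first.
    static <T> List<T> sample(Set<T> source, int k, Random random) {
        int needed = Math.min(k, source.size()); // "n" in the pseudocode
        int remaining = source.size();           // "d" in the pseudocode
        List<T> picked = new ArrayList<T>(needed);
        for (T element : source) {
            if (needed == 0) {
                break; // the sample is complete
            }
            // nextInt(remaining) is uniform on 0..remaining-1, so this
            // condition holds with probability needed / remaining.
            if (random.nextInt(remaining) < needed) {
                picked.add(element);
                needed--;
            }
            remaining--;
        }
        return picked;
    }

    public static void main(String[] args) {
        Set<String> users = new HashSet<String>();
        for (int i = 0; i < 10000; i++) {
            users.add("user" + i);
        }
        List<String> randomUsers = sample(users, 100, new Random());
        System.out.println(randomUsers.size()); // prints 100
    }
}
```

Once the number still needed equals the number of elements remaining, the condition is always true, so the loop returns exactly k elements whenever the set holds at least k.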

Shuffling a collection implies that its elements have some defined order, so that they can be reordered. HashSet is not an ordered collection: there is no order of elements inside (or rather, the details of the ordering are not exposed to the user). Implementation-wise, it therefore does not make much sense to shuffle a HashSet.

What you can do is add all elements from your set to an ArrayList, shuffle it, and take your results from there.

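List<String> usersList = new ArrayList<String>(users);
Collections.shuffle(usersList);
// the first 100 elements of the shuffled list form the random sample
List<String> randomUsers = usersList.subList(0, Math.min(100, usersList.size()));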

java.util.HashSet has no defined order, so you can't shuffle a Set. If you must work with the Set directly, you can iterate over it and pick elements at random positions.

Pseudocode:

Set<String> randomUsers = new HashSet<String>();
Random r = new Random();
Iterator<String> it = users.iterator();
int numUsersNeeded = 100;
int numUsersLeft = users.size();
while (it.hasNext() && randomUsers.size() < 100) {
  String user = it.next();
  // chance of taking this element: users still needed / elements not yet visited
  double prop = (double) numUsersNeeded / numUsersLeft;
  --numUsersLeft;
  if (prop > r.nextDouble() && randomUsers.add(user)) {
    --numUsersNeeded;
  }
}

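As long as the set holds at least 100 elements, this loop always collects exactly 100: once the number still needed equals the number of elements left, prop is 1 and every remaining element is taken, so there is no need to repeat the pass.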

If memory is not an issue, you can create an array and pick 100 random elements:

Pseudocode II:

String[] userArray = users.toArray(new String[0]);
Random random = new Random();
Set<String> randoms = new HashSet<String>();
// keep drawing until 100 distinct users are collected (requires users.size() >= 100)
while (randoms.size() != 100) {
  String randomUser = userArray[random.nextInt(userArray.length)];
  randoms.add(randomUser);
}
