How to get random objects from a stream

Let's say I have a list of words and I want to create a method that takes the size of the new list as a parameter and returns the new list. How can I get random words from my original sourceList?

public List<String> createList(int listSize) {
   Random rand = new Random();
   List<String> wordList = sourceWords.
      stream().
      limit(listSize).
      collect(Collectors.toList()); 

   return wordList;
}

So how and where can I use my Random?

I've found a proper solution. Random provides a few methods that return a stream, for example ints(size), which creates a stream of random integers.

public List<String> createList(int listSize)
{
   Random rand = new Random();
   List<String> wordList = rand.
      ints(listSize, 0, sourceWords.size()).
      mapToObj(i -> sourceWords.get(i)).
      collect(Collectors.toList());

   return wordList;
}
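
Note that ints(listSize, 0, sourceWords.size()) draws each index independently, so the same word can show up more than once. If repeats are unwanted, one possible variation is to draw from an unbounded index stream and keep only distinct values. The following is a minimal sketch, not part of the answer above; the class name DistinctRandomWords and the method createDistinctList are made up for illustration, and the source list is passed in as a parameter instead of being a field.

```java
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.stream.Collectors;

final class DistinctRandomWords {

    // Hypothetical helper: picks listSize words from distinct positions of sourceWords.
    static List<String> createDistinctList(List<String> sourceWords, int listSize) {
        if (listSize < 0 || listSize > sourceWords.size()) {
            // Without this check, distinct().limit(listSize) could never finish.
            throw new IllegalArgumentException("listSize out of range: " + listSize);
        }
        if (listSize == 0) {
            // Also avoids calling ints(0, 0), which rejects an empty range.
            return Collections.emptyList();
        }
        return new Random()
                .ints(0, sourceWords.size())   // unbounded stream of indices in [0, size)
                .distinct()                    // drop indices that were already drawn
                .limit(listSize)               // stop once enough distinct indices exist
                .mapToObj(sourceWords::get)
                .collect(Collectors.toList());
    }
}
```

This tends to slow down when listSize is close to the size of the source list, because more and more draws hit indices that were already taken. In that situation a shuffle-based approach is probably the better fit.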

I think the most elegant way is to have a special collector.

I am pretty sure the only way to guarantee that each item has an equal chance of being picked is to collect, shuffle, and re-stream. This is easy to do with the built-in Collectors.collectingAndThen(...) helper.

Sorting with a random comparator or using a randomized reducer, as suggested in some other answers, produces heavily biased results.

List<String> wordList = sourceWords.stream()
  .collect(Collectors.collectingAndThen(Collectors.toList(), collected -> {
      Collections.shuffle(collected);
      return collected.stream();
  }))
  .limit(listSize)
  .collect(Collectors.toList());

You can move that shuffling collector to a helper function:

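import java.util.Collections;
import java.util.stream.Collector;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class CollectorUtils {

    // Collects all elements, shuffles them, and exposes the result as a new stream.
    // Note: Collectors.toList() makes no guarantee that the list is mutable;
    // Collectors.toCollection(ArrayList::new) is the stricter choice.
    public static <T> Collector<T, ?, Stream<T>> toShuffledStream() {
        return Collectors.collectingAndThen(Collectors.toList(), collected -> {
            Collections.shuffle(collected);
            return collected.stream();
        });
    }

}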
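
A usage sketch for this helper might look like the following. The collector is repeated so that the example compiles on its own; the class ShuffledStreamDemo, the method pickRandom, and the filter step are hypothetical and only there to show that ordinary stream operations can sit before and after the shuffle.

```java
import java.util.Collections;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class ShuffledStreamDemo {

    // Same collector as the helper above, repeated for self-containment.
    static <T> Collector<T, ?, Stream<T>> toShuffledStream() {
        return Collectors.collectingAndThen(Collectors.toList(), collected -> {
            Collections.shuffle(collected);
            return collected.stream();
        });
    }

    // Hypothetical caller: up to listSize random words, each source position used at most once.
    static List<String> pickRandom(List<String> sourceWords, int listSize) {
        return sourceWords.stream()
                .filter(word -> !word.isEmpty())   // any ordinary stream step can go first
                .collect(toShuffledStream())       // collect, shuffle, re-stream
                .limit(listSize)
                .collect(Collectors.toList());
    }
}
```

Because the collector shuffles its own collected list, the source list passed to pickRandom keeps its original order.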

I assume that you are looking for a way to integrate nicely with other stream processing functions. So the following straightforward solution is not what you are looking for :)

Collections.shuffle(wordList);
return wordList.subList(0, listSize);

Here's a solution I came up with which seems to differ from all the other ones, so I figured why not add it to the pile.

Basically, each time you ask for the next element it uses the same kind of trick as one iteration of Collections.shuffle: pick a random element, swap that element with the first one in the list, and move the pointer forwards. You could also do it with the pointer counting back from the end.

The caveat is that it does mutate the list you pass in, but I guess you could just take a copy as the first step if you didn't like that. We were more interested in reducing redundant copies.

private static <T> Stream<T> randomStream(List<T> list)
{
    int characteristics = Spliterator.SIZED;
    // If you know your list is also unique / immutable / non-null
    //int characteristics = Spliterator.DISTINCT | Spliterator.IMMUTABLE | Spliterator.NONNULL | Spliterator.SIZED;
    Spliterator<T> spliterator = new Spliterators.AbstractSpliterator<T>(list.size(), characteristics)
    {
        private final Random random = new SecureRandom();
        private final int size = list.size();
        private int frontPointer = 0;

        @Override
        public boolean tryAdvance(Consumer<? super T> action)
        {
            if (frontPointer == size)
            {
                return false;
            }

            // Same logic as one iteration of Collections.shuffle, so people talking about it not being
            // fair randomness can take that up with the JDK project.
            int nextIndex = random.nextInt(size - frontPointer) + frontPointer;
            T nextItem = list.get(nextIndex);
            // Technically the value we end up putting into frontPointer
            // is never used again, but using swap anyway, for clarity.
            Collections.swap(list, nextIndex, frontPointer);

            frontPointer++;
            // All items from frontPointer onwards have not yet been chosen.

            action.accept(nextItem);
            return true;
        }
    };

    return StreamSupport.stream(spliterator, false);
}
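
The "pointer counting back from the end" variant mentioned above can be sketched without a custom Spliterator when a plain list result is enough. This is an illustration under that assumption rather than the answer's own code: PartialShuffle and sample are hypothetical names, and unlike randomStream this version works on a copy instead of mutating its input.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

final class PartialShuffle {

    // Hypothetical helper: returns count randomly chosen elements of source, without repeats.
    static <T> List<T> sample(List<T> source, int count, Random random) {
        if (count < 0 || count > source.size()) {
            throw new IllegalArgumentException("count out of range: " + count);
        }
        List<T> copy = new ArrayList<>(source);   // leave the caller's list untouched
        int size = copy.size();
        // Walk a pointer back from the end; after each swap, every position
        // at or behind the pointer holds an element that has been chosen.
        for (int back = size - 1; back >= size - count; back--) {
            int pick = random.nextInt(back + 1);  // uniform index in [0, back]
            Collections.swap(copy, pick, back);
        }
        // Only the last count positions were filled; copy them out of the view.
        return new ArrayList<>(copy.subList(size - count, size));
    }
}
```

Only count swaps are performed, so the cost of the selection itself does not grow with the size of the source list beyond the initial copy.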

This is my one-line solution:

 List<String> st = Arrays.asList("aaaa","bbbb","cccc");
 st.stream().sorted((o1, o2) -> RandomUtils.nextInt(0, 2)-1).findFirst().get();

RandomUtils is from Apache Commons Lang 3.

Try something like this:

List<String> getSomeRandom(int size, List<String> sourceList) {
    List<String> copy = new ArrayList<String>(sourceList);
    Collections.shuffle(copy);
    List<String> result = new ArrayList<String>();
    for (int i = 0; i < size; i++) {
        result.add(copy.get(i));
    }

    return result;
}

If you want non-repeated items in the result list and your initial list is immutable:

  • There isn't a direct way to get them from the current Streams API.
  • It's not possible to use a random Comparator, because it would break the compare contract.

You can try something like:

public List<String> getStringList(final List<String> strings, final int size) {
    if (size < 1 || size > strings.size()) {
        throw new IllegalArgumentException("Out of range size.");
    }

    final List<String> stringList = new ArrayList<>(size);

    for (int i = 0; i < size; i++) {
        getRandomString(strings, stringList)
                .ifPresent(stringList::add);
    }

    return stringList;
}

private Optional<String> getRandomString(final List<String> stringList, final List<String> excludeStringList) {
    final List<String> filteredStringList = stringList.stream()
            .filter(c -> !excludeStringList.contains(c))
            .collect(toList());

    if (filteredStringList.isEmpty()) {
        return Optional.empty();
    }

    final int randomIndex = new Random().nextInt(filteredStringList.size());
    return Optional.of(filteredStringList.get(randomIndex));
}

@kozla13 improved version:

List<String> st = Arrays.asList("aaaa","bbbb","cccc");
st.stream().min((o1, o2) -> o1 == o2 ? 0 : (ThreadLocalRandom.current().nextBoolean() ? -1 : 1)).orElseThrow();
  1. Uses the Java built-in class ThreadLocalRandom.
  2. nextInt generates a value from the sequence [-1, 0, 1], but returning 0 from the compare function means the elements are equal, and as a side effect the first element (o1) is always taken in that case.
  3. Handles the object-equality case properly.
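
Whether a coin-flip comparator gives every element the same chance can be checked empirically. The sketch below counts how often each element wins min(...) over many trials. It is an illustration only: the class RandomComparatorBias and the method countWinners are made-up names, and a seeded java.util.Random replaces ThreadLocalRandom so that runs are repeatable. For three elements, a sequential min compares the first two and then the winner against the third, so one would expect the last element to win roughly half of the time rather than a third.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

final class RandomComparatorBias {

    // Counts how often each element is returned by min(...) with a coin-flip comparator.
    static Map<String, Integer> countWinners(List<String> items, int trials, Random random) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String item : items) {
            counts.put(item, 0);
        }
        for (int i = 0; i < trials; i++) {
            String winner = items.stream()
                    .min((o1, o2) -> o1 == o2 ? 0 : (random.nextBoolean() ? -1 : 1))
                    .orElseThrow(IllegalStateException::new);
            counts.merge(winner, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> st = Arrays.asList("aaaa", "bbbb", "cccc");
        // Prints one count per element; compare them against trials / 3.
        System.out.println(countWinners(st, 100_000, new Random(42)));
    }
}
```

If the counts differ noticeably from trials / 3, a shuffle-based pick or a direct st.get(random.nextInt(st.size())) is the safer way to select a single random element.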

If the source list is generally much larger than the new list, you might gain some efficiency by using a BitSet to get random indices:

List<String> createList3(int listSize, List<String> sourceList) {
  if (listSize > sourceList.size()) {
    throw new IllegalArgumentException("Not enough words in the source list.");
  }

  List<String> newWords = randomWords(listSize, sourceList);
  Collections.shuffle(newWords); // optional, for random order
  return newWords;
}

private List<String> randomWords(int listSize, List<String> sourceList) {
  int endExclusive = sourceList.size();
  BitSet indices = new BitSet(endExclusive);
  Random rand = new Random();
  while (indices.cardinality() < listSize) {
    indices.set(rand.nextInt(endExclusive));
  }
  
  return indices.stream().mapToObj(i -> sourceList.get(i))
    .collect(Collectors.toList());
}

A stream is probably overkill. Copy the source list so you're not creating side effects, then give back a sublist of the shuffled copy.

public static List<String> createList(int listSize, List<String> sourceList) {
  if (listSize > sourceList.size()) {
    throw new IllegalArgumentException("Not enough words for new list.");
  }
  List<String> copy = new ArrayList<>(sourceList);
  Collections.shuffle(copy);
  return copy.subList(0, listSize);
}
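
One detail worth knowing here: subList returns a view backed by the full shuffled copy, so the returned list keeps the whole copy reachable for as long as it is in use. If the source list is large and the result is long-lived, the view can be copied into its own list. A minimal sketch of that variation, with the made-up class name RandomSublist and an extra check for negative sizes that the answer above does not have:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class RandomSublist {

    // Variation of createList that returns an independent list instead of a subList view.
    static List<String> createList(int listSize, List<String> sourceList) {
        if (listSize < 0 || listSize > sourceList.size()) {
            throw new IllegalArgumentException("Not enough words for new list.");
        }
        List<String> copy = new ArrayList<>(sourceList);   // avoid side effects on the source
        Collections.shuffle(copy);
        // Copy the view so the rest of the shuffled list can be garbage collected.
        return new ArrayList<>(copy.subList(0, listSize));
    }
}
```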

The answer is very simple (with a stream):

List<String> a = src.stream().sorted((o1, o2) -> {
        if (o1.equals(o2)) return 0;
        return (r.nextBoolean()) ? 1 : -1;
    }).limit(10).collect(Collectors.toList());

You can test it:

List<String> src = new ArrayList<String>();
for (int i = 0; i < 20; i++) {
    src.add(String.valueOf(i*10));
}
Random r = new Random();
List<String> a = src.stream().sorted((o1, o2) -> {
        if (o1.equals(o2)) return 0;
        return (r.nextBoolean()) ? 1 : -1;
    }).limit(10).collect(Collectors.toList());
System.out.println(a);
