How to generate list of unique random floats in Python

I know that there are easy ways to generate lists of unique random integers (e.g. random.sample(range(1, 100), 10)).

I wonder whether there is a better way of generating a list of unique random floats, apart from writing a function that acts like range but accepts floats, like this:

import random

def float_range(start, stop, step):
    vals = []
    i = 0
    current_val = start
    while current_val < stop:
        vals.append(current_val)
        i += 1
        current_val = start + i * step
    return vals

unique_floats = random.sample(float_range(0, 2, 0.2), 3)

Is there a better way to do this?

Answer

One easy way is to keep a set of all random values seen so far and reselect if there is a repeat:

import random

def sample_floats(low, high, k=1):
    """ Return a k-length list of unique random floats
        in the range of low <= x <= high
    """
    result = []
    seen = set()
    for i in range(k):
        x = random.uniform(low, high)
        while x in seen:
            x = random.uniform(low, high)
        seen.add(x)
        result.append(x)
    return result

Notes

  • This is the same technique that Python's own random.sample() uses when the population is large relative to the sample size.

  • The function uses a set to track previous selections because searching a set is O(1) while searching a list is O(n).

  • Computing the probability of a duplicate selection is equivalent to the famous Birthday Problem.

  • Given 2**53 distinct possible values from random(), duplicates are infrequent. On average, you can expect a duplicate float at about 120,000,000 samples.
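
The figure of about 120,000,000 in the last note can be checked with the usual birthday-problem approximation: with N equally likely values, the expected number of draws until the first repeat is roughly sqrt(pi * N / 2). The sketch below assumes, as the note does, that all 2**53 values are equally likely; the function name is made up for illustration. For random.uniform() over an arbitrary range the number of distinct results can differ, so treat the result as an order-of-magnitude estimate.

```python
import math

def expected_draws_until_repeat(n_values):
    """Birthday-problem approximation: about sqrt(pi * N / 2) draws on
    average before the first duplicate, for N equally likely values."""
    return math.sqrt(math.pi * n_values / 2)

# Assumption taken from the note above: random() has 2**53 possible values.
estimate = expected_draws_until_repeat(2 ** 53)
print(f"{estimate:,.0f}")  # on the order of 1.2e8
```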

Variant: Limited float range

If the population is limited to just a range of evenly spaced floats, then it is possible to use random.sample() directly. The only requirement is that the population be a Sequence:

from __future__ import division        # only needed on Python 2
from collections.abc import Sequence   # Python 3.3+; use `collections` on Python 2

class FRange(Sequence):
    """ Lazily evaluated floating point range of evenly spaced floats
        (inclusive at both ends)

        >>> list(FRange(low=10, high=20, num_points=5))
        [10.0, 12.5, 15.0, 17.5, 20.0]

    """
    def __init__(self, low, high, num_points):
        self.low = low
        self.high = high
        self.num_points = num_points

    def __len__(self):
        return self.num_points

    def __getitem__(self, index):
        if index < 0:
            index += len(self)
        if index < 0 or index >= len(self):
            raise IndexError('Out of range')
        p = index / (self.num_points - 1)
        return self.low * (1.0 - p) + self.high * p

Here is an example of choosing ten random samples without replacement from a range of 41 evenly spaced floats from 10.0 to 20.0.

>>> import random
>>> random.sample(FRange(low=10.0, high=20.0, num_points=41), k=10)
[13.25, 12.0, 15.25, 18.5, 19.75, 12.25, 15.75, 18.75, 13.0, 17.75]

You can easily use your list of integers to generate floats:

int_list = random.sample(range(1, 100), 10)
float_list = [x/10 for x in int_list]

Check out this Stack Overflow question about generating random floats.

If you want it to work with Python 2, add this import:

from __future__ import division
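
The same idea extends to an arbitrary interval without building the full list of integers: sample unique integer indices from a range object and scale them. This is a minimal sketch, assuming an evenly spaced grid is acceptable; sample_grid_floats and its num_points parameter are hypothetical names, not part of the answers here.

```python
import random

def sample_grid_floats(lo, hi, num_points, k):
    """Pick k distinct floats from num_points evenly spaced values in [lo, hi]."""
    if num_points < 2:
        raise ValueError("num_points must be at least 2")
    # range() is a lazy sequence, so random.sample never materializes the grid.
    indices = random.sample(range(num_points), k)
    step = (hi - lo) / (num_points - 1)
    # Distinct indices give distinct floats as long as step is not so small
    # that neighbouring grid points round to the same value.
    return [lo + i * step for i in indices]

values = sample_grid_floats(0.0, 2.0, 11, 3)
print(values)
```

Because uniqueness comes from the integer indices, no retry loop is needed; random.sample raises ValueError if k is larger than num_points.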

If you need to guarantee uniqueness, it may be more efficient to

  1. Try to generate n random floats in [lo, hi] at once.
  2. If the number of unique floats is less than n, try to generate however many floats are still needed

and continue accordingly until you have enough, as opposed to generating them one by one in a Python-level loop that checks against a set.

If you can afford to depend on NumPy, doing so with np.random.uniform can be a huge speed-up.

import numpy as np

def gen_uniq_floats(lo, hi, n):
    out = np.empty(n)
    needed = n
    while needed != 0:
        arr = np.random.uniform(lo, hi, needed)
        uniqs = np.setdiff1d(np.unique(arr), out[:n-needed])
        out[n-needed: n-needed+uniqs.size] = uniqs
        needed -= uniqs.size
    np.random.shuffle(out)
    return out.tolist()

If you cannot use NumPy, it may still be more efficient, depending on your data needs, to apply the same concept of checking for dupes afterwards while maintaining a set.

import random

def no_depend_gen_uniq_floats(lo, hi, n):
    seen = set()
    # Count against the set itself, so a fresh draw that repeats an
    # earlier value is not mistaken for a new unique float.
    while len(seen) < n:
        seen.update(random.uniform(lo, hi) for _ in range(n - len(seen)))
    return list(seen)

Rough benchmark

Extreme degenerate case

# Mitch's NumPy solution
%timeit gen_uniq_floats(0, 2**-50, 1000)
153 µs ± 3.71 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

# Mitch's Python-only solution
%timeit no_depend_gen_uniq_floats(0, 2**-50, 1000)
495 µs ± 43.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

# Raymond Hettinger's solution (single number generation)
%timeit sample_floats(0, 2**-50, 1000)
618 µs ± 13 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

More "normal" case (with larger sample)

# Mitch's NumPy solution
%timeit gen_uniq_floats(0, 1, 10**5)
15.6 ms ± 1.12 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)

# Mitch's Python-only solution
%timeit no_depend_gen_uniq_floats(0, 1, 10**5)
65.7 ms ± 2.31 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

# Raymond Hettinger's solution (single number generation)
%timeit sample_floats(0, 1, 10**5)
78.8 ms ± 4.22 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

You could just use random.uniform(start, stop). With double-precision floats, you can be relatively sure that they are unique if your set is small. If you want to generate a large number of random floats and must avoid getting the same number twice, check before adding them to the list.

However, if you are looking for a selection of specific numbers, this is not the solution.

import numpy

min_val = -5
max_val = 15

numpy.random.random_sample(15) * (max_val - min_val) + min_val

or use uniform:

numpy.random.uniform(min_val,max_val,size=15)

As stated in the documentation, Python has the random.random() function:

import random
random.random()

Then you will get a float value such as 0.672807098390448.

So all you need to do is write a for loop and print out random.random():

>>> for i in range(10):
...     print(random.random())

more_itertools has a generic numeric_range that handles both integers and floats.

import random

import more_itertools as mit

random.sample(list(mit.numeric_range(0, 2, 0.2)), 3)
# [0.8, 1.0, 0.4]

random.sample(list(mit.numeric_range(10.0, 20.0, 0.25)), 10)
# [17.25, 12.0, 19.75, 14.25, 15.25, 12.75, 14.5, 15.75, 13.5, 18.25]

random.uniform generates float values:

import random

def get_random(low,high,length):
  lst = []
  while len(lst) < length:
    lst.append(random.uniform(low,high))
    lst = list(set(lst))
  return lst
