
Python Numpy: Random number in a loop

I have the following code, which I run in a Jupyter Notebook:

for j in range(timesteps):    
    a_int = np.random.randint(largest_number/2) # int version

This gives me random numbers, but when I move part of the code into a function, I start to receive the same number in each iteration:

def create_train_data():
    np.random.seed(seed=int(time.time()))
    a_int = np.random.randint(largest_number/2) # int version
    return a_int

for j in range(timesteps):    
    c = create_train_data()  

Why does this happen, and how can I fix it? I think it may be because of processes in Jupyter Notebook.

The offending line of code is

np.random.seed(seed=int(time.time()))

Since you're executing in a loop that completes fairly quickly, calling int() on the time truncates it to whole seconds, so the seed is the same number for the entire loop. Reseeding with the same value resets the generator to the same state every time, so the next number drawn is always the same. If you really want to manually set the seed, the following is a more robust approach.

def create_train_data():
    a_int = np.random.randint(largest_number/2) # int version
    return a_int

np.random.seed(seed=int(time.time()))
for j in range(timesteps):
    c = create_train_data()

Note how the seed is set once and then used for the entire loop, so every call that draws a random integer advances the generator's internal state without resetting it.
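The difference between reseeding inside the function and seeding once can be sketched as follows. This is a minimal illustration, not the asker's full program: the values of largest_number and timesteps are assumed, and draw_reseeding and draw_without_reseeding are hypothetical helper names introduced only for the comparison.

```python
import time
import numpy as np

largest_number = 256  # assumed value, not given in the question
timesteps = 5         # assumed value, not given in the question


def draw_reseeding(seed):
    # Reseeding with the same value restarts the generator at the same state,
    # so the draw is identical on every call.
    np.random.seed(seed)
    return int(np.random.randint(largest_number // 2))


def draw_without_reseeding():
    # No reseeding here: each call advances the generator's internal state.
    return int(np.random.randint(largest_number // 2))


# A fast loop finishes within one second, so int(time.time()) stays constant;
# the seed is computed once here to make that explicit.
fixed_seed = int(time.time())
repeated = [draw_reseeding(fixed_seed) for _ in range(timesteps)]

np.random.seed(fixed_seed)  # seed once, before the loop
varied = [draw_without_reseeding() for _ in range(timesteps)]

print(repeated)  # the same value, timesteps times
print(varied)    # values drawn from successive generator states
```

The first list repeats one value because every call starts from the same state; the second list begins with that same value and then moves on.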

Note that numpy already takes care of a pseudo-random seed. You're not gaining more random results by setting it yourself. A common reason for manually setting the seed is to ensure reproducibility. You set the seed at the start of your program (top of your notebook) to some fixed integer (I see 42 in a lot of tutorials), and then all the calculations follow from that seed. If somebody wants to verify your results, the stochasticity of the algorithms can't be a confounding factor.
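As a small sketch of the reproducibility point, fixing the seed makes two runs produce the same sequence. The helper name simulate and the range of 100 are assumptions made for the illustration.

```python
import numpy as np


def simulate(seed, n=5):
    # Hypothetical helper: seed once, then draw n integers in [0, 100).
    np.random.seed(seed)
    return [int(np.random.randint(100)) for _ in range(n)]


first_run = simulate(42)
second_run = simulate(42)

print(first_run == second_run)  # True: same seed, same sequence
```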

The other answers are correct in saying that it is because of the seed. If you look at the documentation from SciPy, you will see that seeds are used to create a predictable random sequence. However, I think the following answer to another question about seeds gives a better overview of what a seed does and why and where to use it: What does numpy.random.seed(0) do?

Hans Musgrave's answer is great if you are happy with pseudo-random numbers. Pseudo-random numbers are good for most applications, but they are problematic if used for cryptography.

The standard approach for getting a number that is hard to predict is seeding the random number generator with the system time before pulling the number, like you tried. Strictly speaking, the result is still pseudo-random rather than truly random. However, as Hans Musgrave pointed out, if you cast the time to int, you get the time in whole seconds, which will most likely be the same throughout the loop. A correct way to reseed the RNG on each call is:

def create_train_data():
    np.random.seed()
    a_int = np.random.randint(largest_number/2) # int version
    return a_int

This works because NumPy already uses the computer clock or another source of randomness for the seed if you pass no arguments (or None) to np.random.seed:

Parameters: seed : {None, int, array_like}, optional. Random seed used to initialize the pseudo-random number generator. Can be any integer between 0 and 2**32 - 1 inclusive, an array (or other sequence) of such integers, or None (the default). If seed is None, then RandomState will try to read data from /dev/urandom (or the Windows analogue) if available or seed from the clock otherwise.
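As an aside that none of the answers mention, NumPy 1.17 and later also offer np.random.default_rng, which builds a generator object seeded from operating-system entropy when called with no argument. A minimal sketch under that assumption follows. The values of largest_number and timesteps are assumed, and passing the generator into create_train_data is a design choice made for this illustration, not the asker's code.

```python
import numpy as np

largest_number = 256  # assumed value, not given in the question
timesteps = 5         # assumed value, not given in the question

# Created once; with no seed argument NumPy draws fresh entropy from the OS.
rng = np.random.default_rng()


def create_train_data(rng):
    # Draw one integer in [0, largest_number // 2) from the shared generator.
    return int(rng.integers(largest_number // 2))


samples = [create_train_data(rng) for _ in range(timesteps)]
print(samples)
```

Because the generator is created once and passed in, the function never touches the global seed, which avoids the reseeding problem altogether.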

It all depends on your application though. Do note the warning in the docs:

Warning: The pseudo-random generators of this module should not be used for security purposes. For security or cryptographic uses, see the secrets module.
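For the security-sensitive case that the warning refers to, a short sketch with the standard-library secrets module might look like this. The value of largest_number is assumed, and the token is only an illustration of what the module is typically used for.

```python
import secrets

largest_number = 256  # assumed value, not given in the question

# Cryptographically strong integer in [0, largest_number // 2).
a_int = secrets.randbelow(largest_number // 2)

# 16 random bytes rendered as 32 hexadecimal characters.
token = secrets.token_hex(16)

print(a_int, token)
```

Values from secrets cannot be seeded or reproduced, so this suits keys and tokens rather than training data that needs to be repeatable.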


 