Why am I getting different bootstrap results using different algorithms?

I am using two different methods to try to generate a bootstrap sample:

import numpy as np

np.random.seed(335)
y = np.random.normal(0, 1, 5)
b = np.empty(len(y))  # allocates an uninitialized vector of length n
for j in range(len(y)):
    # Intended to draw a random integer from 1 to n, where n is our sample size.
    # Note: randint excludes its upper bound, so this actually draws from 1 to n-1.
    a = np.random.randint(1, len(y))
    b[j] = y[a-1]  # indices in Python start at zero, the worst part of Python in my opinion
c = np.random.choice(y, size=5)
print(b)
print(c)

and for my output I get different results:

[1.04749432 1.71963433 1.71963433 1.71963433 1.71963433]
[-0.25224454 -0.25224454  0.46604474  1.71963433  0.46604474]

I think the answer has something to do with the random number generator, but I'm confused as to the exact reason.

This comes down to the use of different algorithms for randomized selection. There are numerous equivalent ways to select items at random with replacement using a pseudorandom generator (or to generate random variates from any other distribution). In particular, the algorithm for numpy.random.choice need not, in theory, make use of numpy.random.randint. What matters is that these equivalent ways should produce the same distribution of random variates. In the case of NumPy, look at NumPy's source code.
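To illustrate "same distribution, different numbers", here is a minimal sketch that compares the two selection procedures empirically. The sample values, the seed, and the number of draws are arbitrary choices made for this illustration, not anything taken from the question.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, chosen only for this sketch
y = np.array([10.0, 20.0, 30.0, 40.0, 50.0])  # hypothetical sample
n_draws = 100_000

# Procedure 1: draw indices uniformly from 0..n-1 (upper bound is exclusive),
# then use them to index into y.
idx = np.random.randint(0, len(y), size=n_draws)
draws_by_index = y[idx]

# Procedure 2: let choice() pick elements directly, with replacement.
draws_by_choice = np.random.choice(y, size=n_draws)

# The individual draws need not match, but the relative frequency of each
# element should be close to 1/5 under both procedures.
freq_by_index = np.array([(draws_by_index == v).mean() for v in y])
freq_by_choice = np.array([(draws_by_choice == v).mean() for v in y])
print(freq_by_index)
print(freq_by_choice)
```

With this many draws, both frequency vectors should sit near 0.2 in every position, even though the two sequences of draws are not expected to be identical.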

Another, less important, reason for different results is that the two selection procedures (randint and choice) each draw pseudorandom numbers of their own, which can differ from each other because the procedures didn't begin with the same seed (more precisely, the same sequence of pseudorandom numbers). If we set the seed to the same value before beginning each procedure:

import numpy as np

np.random.seed(335)
y = np.random.normal(0, 1, 5)
b = np.empty(len(y))
np.random.seed(999999)  # Seed selection procedure 1
for j in range(len(y)):
    a = np.random.randint(1, len(y))  # upper bound is exclusive: draws from 1 to n-1
    b[j] = y[a-1]
np.random.seed(999999)  # Seed selection procedure 2
c = np.random.choice(y, size=5)
print(b)
print(c)

then each procedure will begin with the same pseudorandom numbers. But even so, the two procedures may use different algorithms for random selection, and these differences may still lead to different results.
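As a small sanity check of what reseeding does guarantee, the sketch below runs the same procedure twice from the same seed and confirms that it replays identically. The seed value is the one used above; the sample values are hypothetical.

```python
import numpy as np

y = np.array([1.0, 2.0, 3.0, 4.0, 5.0])  # hypothetical sample

np.random.seed(999999)
first = np.random.choice(y, size=5)

np.random.seed(999999)  # reset to the same starting point
second = np.random.choice(y, size=5)

# Same seed and same procedure: the draws are replayed exactly.
print(np.array_equal(first, second))
```

Reseeding only fixes the stream of underlying pseudorandom numbers. Two different procedures fed that same stream are guaranteed to agree in distribution, not draw for draw.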

(However, numpy.random.* functions, such as randint and choice, have become legacy functions as of NumPy 1.17, and their algorithms are expected to remain as they are for backward compatibility reasons. That version didn't deprecate any numpy.random.* functions, however, so they are still available for the time being. See also this question. In newer applications you should make use of the new system introduced in version 1.17, including numpy.random.Generator, if you have that version or later. One advantage of the new system is that the application relies less on global state.)
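For reference, a bootstrap resample written against the newer interface might look like the sketch below. It assumes NumPy 1.17 or later; the seed is reused from the question purely for illustration, and the generated values will not match the legacy output shown earlier.

```python
import numpy as np

rng = np.random.default_rng(335)  # local generator; no global state is touched
y = rng.normal(0, 1, 5)           # original sample of size n = 5

# Bootstrap resample: n draws from y with replacement.
boot = rng.choice(y, size=len(y), replace=True)

# Equivalent in distribution: draw indices 0..n-1 (upper bound is exclusive)
# and index into y. The two results need not be identical.
idx = rng.integers(0, len(y), size=len(y))
boot_by_index = y[idx]

print(boot)
print(boot_by_index)
```

Because rng is an ordinary object, it can be passed to functions explicitly, which avoids the hidden coupling that comes with calling np.random.seed.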


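Notice: technical posts on this site are licensed under CC BY-SA 4.0. If you republish this content, please cite this site's URL or the original source. For any questions, contact yoyou2525@163.com.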
