Jython random module produces different results to CPython

I'm generating some test data using a known random seed. I want to use this data from CPython and from Jython. I've found that the data is different if I use Jython (2.5.2) vs. CPython.

Boiling it down to a simple test, I can see that the PRNG gives different results in the two implementations:

In Jython:

Jython 2.5.2 (Release_2_5_2:7206, Mar 2 2011, 23:12:06) 
[Java HotSpot(TM) Server VM (Sun Microsystems Inc.)] on java1.6.0_26
Type "help", "copyright", "credits" or "license" for more information.
>>> import random
>>> random.seed(1)
>>> random.random()
0.7308781974052877

In CPython:

Python 2.7.2+ (default, Oct  4 2011, 20:03:08) 
[GCC 4.6.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import random
>>> random.seed(1)
>>> random.random()
0.13436424411240122

The test data I'm generating is reproducible within each Python implementation, but not across them. Is there a way around this? Maybe I need to code my own PRNG?

There is a way around this. Both implementations include the pure-Python "WichmannHill" PRNG. It's slower, but it gives the same results in both Jython and CPython.

In my code I replaced

random.seed(1)
uuid += random.choice(hexdigits)

with

rand = random.WichmannHill(1)
uuid += rand.choice(hexdigits)

As delnan said in a comment, it is no surprise that different Python interpreters generate different random sequences. The official documentation refers to the C implementation of an algorithm. Other Python implementations may choose other algorithms. In fact, the lowest common denominator might be only the distribution of the produced random sequences.

If you depend on pseudo-random sequences that can be reproduced across all Python interpreters, you have to write your own pseudo-random number generator. A linear feedback shift register may be a good start and is relatively easy to understand.
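
As a rough illustration of the linear feedback shift register idea, here is a minimal sketch of a 16-bit Fibonacci LFSR with taps at bits 16, 14, 13 and 11. The class name LfsrRandom, the method names, and the fallback state 0xACE1 are assumptions chosen for this example. It uses only integer operations, so it should behave the same on any interpreter, but its state repeats after 65535 steps, so it is suitable only for small amounts of test data.

```python
from string import hexdigits


class LfsrRandom(object):
    """Hypothetical 16-bit Fibonacci LFSR (taps 16, 14, 13, 11)."""

    def __init__(self, seed):
        # An all-zero state never changes, so substitute a fixed
        # non-zero value in that case.
        self._state = (int(seed) & 0xFFFF) or 0xACE1

    def next_bit(self):
        """Advance the register one step and return the fed-back bit."""
        s = self._state
        bit = (s ^ (s >> 2) ^ (s >> 3) ^ (s >> 5)) & 1
        self._state = (s >> 1) | (bit << 15)
        return bit

    def randbelow(self, n):
        """Return an integer in range(n) using rejection sampling."""
        bits = 1
        while (1 << bits) < n:
            bits += 1
        while True:
            value = 0
            for _ in range(bits):
                value = (value << 1) | self.next_bit()
            if value < n:
                return value

    def choice(self, seq):
        """Pick one element of a non-empty sequence."""
        return seq[self.randbelow(len(seq))]


# Usage: the same seed gives the same characters on every run.
rand = LfsrRandom(1)
sample = "".join(rand.choice(hexdigits) for _ in range(8))
```

Rejection sampling is used in randbelow so that every index is equally likely even when the sequence length is not a power of two.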
