Python 2.7 random sampling causes memory error

random.sample(range(2**31 - 1), random.randrange(1, 100))

This results in:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
MemoryError

I'm running Python 2.7.3 on Ubuntu 12.04 64-bit with 6GB RAM.

I thought 2**31 - 1 was the normal upper limit for integers on a 32-bit computer. I'm still 1 below that and I'm getting a memory error?

You are probably thinking of the limit on integers in languages like C, where an int is usually 4 bytes. In Python 2.7, integers have no such limit: they are automatically promoted to a larger type (long) when needed, with arbitrary precision. Your problem is not directly related to this. range(2**31 - 1) tries to build a list holding all 2**31 - 1 numbers starting at 0, which requires a very large amount of memory, hence the MemoryError.
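To get a rough feel for why that list cannot fit in 6GB, here is a back-of-the-envelope estimate. It is only a sketch: it assumes CPython, counts one pointer per list slot plus one int object per element, and ignores the list header and the small-int cache. The numbers depend on the interpreter and platform it runs on.

```python
import struct
import sys

n = 2**31 - 1                        # number of elements range() would create

pointer_size = struct.calcsize("P")  # bytes per list slot (one pointer each)
int_size = sys.getsizeof(12345)      # bytes per int object on this interpreter

list_bytes = n * pointer_size        # the list's internal pointer array
int_bytes = n * int_size             # the int objects the list points to

total_gib = (list_bytes + int_bytes) / float(2**30)
print("roughly %.0f GiB needed" % total_gib)
```

Even the pointer array alone is far larger than the RAM mentioned in the question on a 64-bit build.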

You do not need to create this list, just use xrange, which evaluates lazily:

>>> random.sample(xrange(2**31 - 1), random.randrange(1, 100))
[248934843, 2102394624, 1637327281, 352636013, 2045080621, 1235828868, 951394286, 69276138, 2116665801, 312166638, 563309004, 546628308, 1890125579, 699765051, 1567057622, 656971854, 827087462, 1306546407, 1071615348, 601863879, 410052433, 1932949412, 1562548682, 1772850083, 1960078129, 742789541, 175226686, 1513744635, 512485864, 2074307325, 798261810, 1957338925, 1414849089, 1375678508, 1387543446, 1394012809, 951027358, 1169817473, 983989994, 1340435411, 823577680, 1078179464, 2051607389, 180372351, 552409811, 1830127563, 1007051104, 2112306498, 1936077584, 2133304389, 1853748484, 437846528, 636724713, 802267101, 1398481637, 15781631, 2139969958, 126055150, 1997111252, 1666405929, 1368202177, 1957578568, 997110500, 1195347356, 1595947705, 2138216956, 768776647, 128275761, 781344096, 941120866, 1195352770, 1244713418, 930603490, 2147048666, 2029878314, 2030267692, 1471665936, 1763093479, 218699995, 1140089127, 583812185, 1405032224, 851203055, 1454295991, 1105558477, 2118065369, 1864334655, 1792789380, 386976269, 1213322292, 663210178, 1402712466, 1564065247]

xrange is not a generator; it is a sequence object that evaluates lazily. It supports len(), indexing and iteration, much like the list returned by range, but it does not need to store its data in a list. (In Python 3 it is simply called range.) Because it has a length and can be indexed, random.sample can pick k items from it directly without materializing every element.
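The sequence behaviour described above can be checked directly. This is a minimal sketch; the name lazy_range is a hypothetical alias introduced here so the same snippet runs on Python 2 (xrange) and Python 3 (range).

```python
import random

try:
    lazy_range = xrange   # Python 2: lazy sequence object
except NameError:
    lazy_range = range    # Python 3: range is already lazy

population = lazy_range(2**31 - 1)

n = len(population)           # length is known without building a list
item = population[10**9]      # indexing computes the value on demand
picks = random.sample(population, 5)  # only the sampled items are created

print(n, item, picks)
```

Nothing here allocates memory proportional to the size of the range, which is why the call returns immediately.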

Note: Regarding 2**31 - 1 being the largest number (that is sys.maxint on a 32-bit build; check sys.maxint on your own system), you were correct in a sense: xrange has a limit on its upper bound, because its arguments must fit in a C long. This does not mean a Python int has a maximum size, though. Python 3 removes this limitation in its regular range. On a build where a C long is 32 bits:

>>> random.sample(xrange(2**31), random.randrange(1, 100))

Traceback (most recent call last):
  File "<pyshell#16>", line 1, in <module>
    random.sample(xrange(2**31), random.randrange(1, 100))
OverflowError: Python int too large to convert to C long
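If you are stuck on Python 2 and need a bound that xrange rejects, one possible workaround is to draw values with random.randrange and discard duplicates. The helper below is a hypothetical sketch, not part of the random module, and it is only efficient when k is much smaller than n.

```python
import random

def sample_from_large_range(n, k):
    """Return k distinct random integers from [0, n) without building the range.

    Hypothetical helper for illustration; assumes k is small relative to n.
    """
    if not 0 <= k <= n:
        raise ValueError("k must be between 0 and n")
    seen = set()
    picks = []
    while len(picks) < k:
        candidate = random.randrange(n)  # accepts arbitrarily large n
        if candidate not in seen:        # reject duplicates to sample without replacement
            seen.add(candidate)
            picks.append(candidate)
    return picks

picks = sample_from_large_range(2**40, 10)
print(picks)
```

Memory use grows with k rather than with n, so the size of the range no longer matters.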
