
Python multidimensional boolean array?

It would contain at most 1000 x 1000 x 1000 elements, which is too big for a Python dictionary.

With a dict of around 30 x 1000 x 1000 elements, it had already consumed 2 GB of memory on my machine and everything slowed to a crawl.

Are there any modules that can handle a 3-dimensional array whose values would be only True/False? I checked bitarray http://pypi.python.org/pypi/bitarray , which seems reasonable and is coded in C; however, it seems more like a bit-stream than an array, since it supports only 1 dimension.

numpy is your friend:

import numpy as np
a = np.zeros((1000,1000,1000), dtype=bool)
a[1,10,100] = True

This keeps the memory footprint small: a NumPy bool array uses one byte per element, so the full 1000 x 1000 x 1000 array takes about 1 GB.

EDIT:

If you really need to, you can also look at the defaultdict container class in the collections module, which lets you store only the entries that differ from the default value. Note that reading a missing key with d[key] inserts the default, so use .get() or "in" for lookups. But if it's not really a must, use numpy.
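A minimal sketch of that sparse approach, using only the standard library. The name `cells` and the coordinates are illustrative and not part of the original answer; it assumes that most cells are False, so only the True ones need storing.

```python
from collections import defaultdict

# Sparse 3-D boolean "array": keys are (x, y, z) tuples, missing keys mean False.
cells = defaultdict(bool)

cells[1, 10, 100] = True            # only this entry is stored

# .get() does not call the default factory, so nothing is inserted.
print(cells.get((5, 5, 5), False))  # False
print(len(cells))                   # 1

# Indexing a missing key calls bool() and stores the resulting False.
print(cells[5, 5, 5])               # False
print(len(cells))                   # 2
```

If the array is dense rather than sparse, this costs far more per element than numpy, which is why the dict in the question ran out of memory.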

How about a list of lists of bitarrays, perhaps wrapped into your own class with a nice API?

Alternatively, a 3D NumPy array of integers, with your own code packing/unpacking multiple booleans into each integer.
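One way to sketch the packing idea is with `np.packbits`, which stores 8 booleans per `uint8` along one axis. The helpers `get_bit` and `set_bit` are hypothetical names written for this illustration, and the small shape is chosen only to keep the example cheap; the last axis is assumed to be a multiple of 8.

```python
import numpy as np

# Small shape for illustration; the last axis is packed 8 booleans per byte.
shape = (4, 5, 16)
a = np.zeros(shape, dtype=bool)
a[1, 2, 11] = True

packed = np.packbits(a, axis=-1)    # dtype uint8, shape (4, 5, 2)
print(a.nbytes, packed.nbytes)      # 320 40

def get_bit(packed, x, y, z):
    # packbits is big-endian by default: element 0 of a row is the MSB of byte 0
    byte = int(packed[x, y, z >> 3])
    return bool((byte >> (7 - (z & 7))) & 1)

def set_bit(packed, x, y, z, value):
    mask = 1 << (7 - (z & 7))
    byte = int(packed[x, y, z >> 3])
    packed[x, y, z >> 3] = (byte | mask) if value else (byte & ~mask & 0xFF)

print(get_bit(packed, 1, 2, 11))    # True
```

At the size in the question this layout would need 1000 x 1000 x 125 bytes, about 125 MB, at the price of doing the bit arithmetic yourself.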

numpy has already been suggested by EnricoGiampieri, and if you can use it, you should.

Otherwise, there are two choices:

A jagged array, as suggested by NPE, would be a list of lists of bitarrays. This allows jagged bounds: e.g., each row could be a different width, or even independently resizable:

bits3d = [[bitarray.bitarray(1000) for y in range(1000)] for x in range(1000)]
myvalue = bits3d[x][y][z]

Alternatively, as suggested by Xymostech, do your own indexing on a 1-D array:

bits3d = bitarray.bitarray(1000*1000*1000)
myvalue = bits3d[x + y*1000 + z*1000*1000]
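The multipliers above are all 1000 only because every dimension has the same size. For unequal dimensions, the stride for y is the x size and the stride for z is the x size times the y size. A small sketch of that arithmetic and its inverse, with hypothetical helper names `flat_index` and `coords`:

```python
def flat_index(x, y, z, xsize, ysize):
    # x varies fastest, then y, then z
    return x + y * xsize + z * xsize * ysize

def coords(index, xsize, ysize):
    # inverse of flat_index
    z, rest = divmod(index, xsize * ysize)
    y, x = divmod(rest, xsize)
    return x, y, z

xsize, ysize, zsize = 3, 4, 5
indices = {flat_index(x, y, z, xsize, ysize)
           for x in range(xsize) for y in range(ysize) for z in range(zsize)}

# Every cell maps to a distinct slot in range(xsize * ysize * zsize).
print(indices == set(range(xsize * ysize * zsize)))              # True
print(coords(flat_index(2, 3, 4, xsize, ysize), xsize, ysize))   # (2, 3, 4)
```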

Either way, you'd probably want to wrap this up in a class, so you can do this:

bits3d = BitArray(1000, 1000, 1000)
myvalue = bits3d[x, y, z]

That's as easy as:

from bitarray import bitarray

def zeroed(n):
    # older bitarray releases leave the bits of bitarray(n) uninitialized,
    # so clear them explicitly
    b = bitarray(n)
    b.setall(False)
    return b

class Jagged3DBitArray(object):
    def __init__(self, xsize, ysize, zsize):
        # one bitarray of length zsize for each (x, y) pair
        self.lll = [[zeroed(zsize) for y in range(ysize)]
                    for x in range(xsize)]
    def __getitem__(self, key):
        x, y, z = key
        return self.lll[x][y][z]
    def __setitem__(self, key, value):
        x, y, z = key
        self.lll[x][y][z] = value

class Fixed3DBitArray(object):
    def __init__(self, xsize, ysize, zsize):
        self.xsize, self.ysize, self.zsize = xsize, ysize, zsize
        self.b = zeroed(xsize * ysize * zsize)
    def _index(self, key):
        # x varies fastest: the stride is xsize for y and xsize * ysize for z
        x, y, z = key
        return x + y * self.xsize + z * self.xsize * self.ysize
    def __getitem__(self, key):
        return self.b[self._index(key)]
    def __setitem__(self, key, value):
        self.b[self._index(key)] = value

Of course, if you want more functionality (like slicing), you have to write a bit more.

The jagged array will use a bit more memory (after all, you have the overhead of 1M bitarray objects and 1K list objects), and may be a bit slower, but this usually won't make much difference.

The important deciding factor should be whether it's inherently an error for your data to have jagged rows. If so, use the second solution; if it might be useful to have jagged or resizable rows, use the former. (Keep in mind that I'd use numpy over either solution, if at all possible.)

How about getting inspired by unix file permissions? 755 is read, write, execute for the owner and read, execute for everyone else. This is because 7 translates to binary 111 and 5 to binary 101.

So your 1000x1000x1000 bool array could be a 1000x1000 list of ints, in which the binary representation of each int gives you a 1000-bit string representing one row of the bool array.

All of that should fit in under 1 GB of memory.
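A rough sketch of that idea in plain Python, relying on the fact that Python ints have arbitrary precision. The helpers `set_cell` and `get_cell` are hypothetical names for this illustration, and the grid is kept tiny here; only the bit position goes up to 999.

```python
# Unix-style illustration: each octal digit of 0o755 is three permission bits.
print(format(0o755, "09b"))     # 111101101

xsize, ysize = 4, 5
# grid[x][y] is an int whose bit z holds the boolean at (x, y, z)
grid = [[0] * ysize for _ in range(xsize)]

def set_cell(grid, x, y, z, value):
    if value:
        grid[x][y] |= 1 << z        # turn bit z on
    else:
        grid[x][y] &= ~(1 << z)     # turn bit z off

def get_cell(grid, x, y, z):
    return bool((grid[x][y] >> z) & 1)

set_cell(grid, 1, 2, 999, True)
print(get_cell(grid, 1, 2, 999), get_cell(grid, 1, 2, 998))   # True False
print(grid[1][2].bit_length())  # 1000
```

Each update builds a new int object, so this trades speed for having no third-party dependency.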


 