numpy dtype ValueError: invalid shape in fixed-type tuple - how can I get around it?
I use a custom datatype, e.g. datatype = np.dtype('({:n},{:n})f4'.format(10000,100000)), to read data from a binary file using
np.fromfile(filename, dtype=datatype)
However, defining the datatype with np.dtype raises an error for large datasets, as with the example datatype above:
ValueError: invalid shape in fixed-type tuple: dtype size in bytes must fit into a C int
Initializing an array of that size is no problem: a=np.zeros((10000,100000)). So my question is: where does that limitation come from, and how can I get around it? I can of course use a loop and read chunks at a time, but maybe there is a more elegant way?
When you specify a dtype of '(M, N)f4', you are effectively specifying the final two dimensions of the output array, e.g.
np.zeros(5, np.dtype('(6, 7)f4')).shape
# (5, 6, 7)
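As for where the limit comes from, the error message itself points at the cause: with a subarray dtype, the whole (M, N) block counts as a single item, and the size of that item in bytes has to fit into a C int. A rough check of the arithmetic, using np.intc as NumPy's counterpart of C int and the tuple spelling of the subarray dtype (both choices are mine for illustration, not part of the question):

```python
import numpy as np

# A small subarray dtype: the whole (6, 7) block of float32 is ONE item.
small = np.dtype((np.float32, (6, 7)))
print(small.itemsize)  # 6 * 7 * 4 bytes

# Item size the dtype from the question would need, computed with plain
# Python integers so that no oversized dtype is ever constructed.
needed = 10000 * 100000 * np.dtype(np.float32).itemsize

# Largest value a C int can hold on this platform.
c_int_max = np.iinfo(np.intc).max

print(needed, c_int_max, needed > c_int_max)
```

np.zeros((10000,100000)) does not hit this limit because there the item is a single number and only the array shape is large.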
You could achieve the same outcome by simply reading in the data as a 1D array, then reshaping it to your desired shape:
x = np.fromfile(filename, np.float32).reshape(-1, 10000, 100000)
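To try this without a multi-gigabyte file, here is a minimal sketch at a small scale. The shape (3, 4), the block count and the temporary file name are stand-ins chosen for illustration; only np.fromfile and reshape come from the answer above.

```python
import os
import tempfile

import numpy as np

# Hypothetical small dimensions standing in for (10000, 100000).
rows, cols = 3, 4
n_blocks = 2

# Sample data written as raw float32, the layout np.fromfile expects.
data = np.arange(n_blocks * rows * cols, dtype=np.float32)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.bin")
    data.tofile(path)

    # Read everything as a flat float32 array, then let -1 infer how many
    # (rows, cols) blocks the file contains.
    raw = np.fromfile(path, np.float32)
    blocks = raw.reshape(-1, rows, cols)

print(blocks.shape)
# The reshape is a view on the flat array, so no second copy is made.
print(np.shares_memory(raw, blocks))
```

Note that reshape raises a ValueError if the number of values in the file is not a multiple of rows * cols, which doubles as a sanity check on the file size.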