numpy dtype ValueError: invalid shape in fixed-type tuple - how do I solve it?

Question

I use a custom datatype, e.g. datatype = np.dtype('({:n},{:n})f4'.format(10000,100000)), to read data from a binary file using

np.fromfile(filename, dtype=datatype)

However, defining the datatype with np.dtype raises an error for large datasets, as with the example datatype above:

ValueError: invalid shape in fixed-type tuple: dtype size in bytes must fit into a C int

Initializing an array of that size is no problem: a = np.zeros((10000,100000)). So my question is: where does this limitation come from, and how can I get around it? I can of course use a loop and read chunks at a time, but maybe there is a more elegant way?

Answer 1

When you specify a dtype of '(M, N)f4', you are effectively specifying the final two dimensions of the output array, e.g.

np.zeros(5, np.dtype('(6, 7)f4')).shape
# (5, 6, 7)
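
As a rough illustration of where the limit comes from: a subarray dtype treats the whole (M, N) block as a single element, so its size in bytes is M * N * 4 for f4. The sketch below only does the arithmetic for the shape in the question; it assumes a 32-bit signed C int (the usual case, but platform-dependent), and the variable names are made up for the example.

```python
import numpy as np

# A subarray dtype stores its whole (M, N) block as ONE element,
# so its itemsize is M * N * (bytes per float32).
small = np.dtype('(6, 7)f4')
print(small.shape)     # (6, 7)
print(small.itemsize)  # 168 = 6 * 7 * 4

# Size the dtype from the question would need, computed by hand
# (the dtype itself is never constructed here).
rows, cols = 10000, 100000
needed_bytes = rows * cols * np.dtype('f4').itemsize

# Assumption: C int is a 32-bit signed integer on this platform.
c_int_max = 2**31 - 1

print(needed_bytes)              # 4000000000
print(needed_bytes > c_int_max)  # True
```

By contrast, np.zeros((10000,100000)) keeps the shape on the array rather than in the dtype, so each element stays a few bytes wide and the check is never hit.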

You could achieve the same outcome by simply reading in the data as a 1D array, then reshaping it to your desired shape:

x = np.fromfile(filename, np.float32).reshape(-1, 10000, 100000)
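
To check the read-then-reshape approach without a multi-gigabyte file, here is a minimal sketch on a tiny stand-in. The file name data.bin, the temporary directory, and the (5, 6, 7) shape are assumptions chosen for the example, not part of the original question.

```python
import os
import tempfile

import numpy as np

# Hypothetical small stand-in for the real file: 5 blocks of shape (6, 7).
original = np.arange(5 * 6 * 7, dtype=np.float32).reshape(5, 6, 7)

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "data.bin")
    original.tofile(path)  # raw binary dump, no header

    # Read everything as plain float32 scalars, then restore the block shape.
    flat = np.fromfile(path, dtype=np.float32)
    blocks = flat.reshape(-1, 6, 7)

print(flat.shape)                        # (210,)
print(blocks.shape)                      # (5, 6, 7)
print(np.array_equal(blocks, original))  # True
```

Because the dtype passed to np.fromfile is the plain 4-byte float32, the dtype size check is not involved; the large dimensions only appear in the reshape, where -1 lets NumPy infer the number of blocks from the file length.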

Answer 1: score 2, accepted, 2015-10-20 16:15:27
