Split a large numpy array into multiple numpy arrays?
I have a large numpy array with a size of 699720 and a shape of (4998, 140). Here is what the numpy array looks like:
[[-0.11252183 -2.8272038 -3.773897 ... 0.12343082 0.92528623
0.19313742]
[-1.1008778 -3.9968398 -4.2858424 ... 0.7738197 1.1196209
-1.4362499 ]
[-0.567088 -2.5934503 -3.8742297 ... 0.32109663 0.9042267
-0.4217966 ]
...
[-1.1229693 -2.252925 -2.867628 ... -2.874136 -2.0083694
-1.8083338 ]
[-0.54770464 -1.8895451 -2.8397787 ... 1.261335 1.1504486
0.80493224]
[-1.3517791 -2.2090058 -2.5202248 ... -2.2600229 -1.577823
-0.6845309 ]]
I would like to split the numpy array into 4 different numpy arrays. The first 3 would each be 30% of the numpy array, e.g. numpyarray1 should be 0-30%, numpyarray2 should be 31-60%, numpyarray3 should be 61-90% and numpyarray4 should be 91-100% of the dataset.
You can achieve this with numpy.split(). It accepts either a number of equal sections or a list of indices at which to cut, so uneven splits like yours are possible. Note, however, that it returns views into the original array: no data is copied, which saves memory, but modifying one of the sub-arrays also modifies the original.
See this example:
import numpy as np
arr = np.random.random((100, 100))
nr_rows = arr.shape[0]
# Get the indices for the first three sections
# You can do some fancy calculation for the first N sections
section_borders = [(i+1) * 3 * (nr_rows // 10) for i in range(3)]
# Do the splitting
arr_splits = np.split(arr, section_borders)
print([sarr.shape for sarr in arr_splits])
# --> [(30, 100), (30, 100), (30, 100), (10, 100)]
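Applied to the shape from the question, the same idea might look like the sketch below. The array contents are random placeholders, and the names `data`, `borders`, `parts` and `independent` are assumptions made for illustration. The borders are computed with integer arithmetic, so each of the first three parts can differ from an exact 30% by a row. The last lines check the view behaviour mentioned above with np.shares_memory().

```python
import numpy as np

# Placeholder data with the shape from the question (values are random).
data = np.random.random((4998, 140))
n_rows = data.shape[0]

# Row indices at 30%, 60% and 90% of the rows (integer division).
borders = [n_rows * pct // 100 for pct in (30, 60, 90)]

parts = np.split(data, borders)
print(borders)
# --> [1499, 2998, 4498]
print([p.shape for p in parts])
# --> [(1499, 140), (1499, 140), (1500, 140), (500, 140)]

# Each part is a view, so it shares memory with `data`.
print(all(np.shares_memory(p, data) for p in parts))
# --> True

# Call .copy() when an independent array is needed.
independent = parts[0].copy()
print(np.shares_memory(independent, data))
# --> False
```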
In case you want to split along columns instead, you can pass the axis parameter to the function (for example, axis=1).
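For the column case, a minimal sketch using axis=1 on the same kind of 100 x 100 test array could look like this. The names `col_borders` and `col_splits` are assumptions made for illustration.

```python
import numpy as np

arr = np.random.random((100, 100))
n_cols = arr.shape[1]

# Column indices at 30%, 60% and 90% of the columns.
col_borders = [n_cols * pct // 100 for pct in (30, 60, 90)]

# axis=1 cuts along the columns; the row count stays unchanged.
col_splits = np.split(arr, col_borders, axis=1)
print([s.shape for s in col_splits])
# --> [(100, 30), (100, 30), (100, 30), (100, 10)]
```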