I am trying to concatenate many NumPy arrays, each stored in its own file. The problem is that I have a lot of files, and there is not enough memory to create one big array, Data_Array = np.zeros((1000000,7000)), to hold all of them. I found in the question "Combining NumPy arrays" that I can use np.concatenate:
file1= np.load('file1_Path.npy')
file2= np.load('file2_Path.npy')
file3= np.load('file3_Path.npy')
file4= np.load('file4_Path.npy')
dataArray=np.concatenate((file1, file2, file3, file4), axis=0)
test= dataArray.shape
print(test)
print (dataArray)
print (dataArray.shape)
plt.plot(dataArray.T)
plt.show()
This gives me a very good result, but now I need to replace file1, file2, file3, file4 with the path to the folder containing my files:
import matplotlib.pyplot as plt
import numpy as np
import glob
import os, sys
fpath ="Path_To_Big_File"
npyfilespath =r'Path_To_Many_Numpy_Files'
os.chdir(npyfilespath)
npfiles= glob.glob("*.npy")
npfiles.sort()
for i,npfile in enumerate(npfiles):
    dataArray=np.concatenate(npfile, axis=0)
np.save(fpath, all_arrays)
It gives me this error:
np.concatenate(npfile, axis=0)
ValueError: zero-dimensional arrays cannot be concatenated
Could you please help me make np.concatenate work with this approach?
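The error appears because the loop hands np.concatenate a file name (a string) instead of a sequence of loaded arrays. A minimal sketch of a corrected loop follows, assuming the arrays fit in memory together and agree in every dimension except the first. The helper name load_and_concatenate and the demo files are hypothetical and only serve as illustration:

```python
import glob
import os
import tempfile

import numpy as np


def load_and_concatenate(folder):
    """Load every .npy file in `folder` (sorted by name) and join them along axis 0."""
    paths = sorted(glob.glob(os.path.join(folder, "*.npy")))
    arrays = [np.load(p) for p in paths]  # load the data, not just the file names
    return np.concatenate(arrays, axis=0)


# Small demo with made-up data written to a temporary folder.
demo_dir = tempfile.mkdtemp()
for k in range(3):
    np.save(os.path.join(demo_dir, f"file{k}.npy"), np.full((1, 4), k))

data_array = load_and_concatenate(demo_dir)
print(data_array.shape)
```

This still holds every array in RAM at once, so it does not address the memory limit described in the question.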
If you wish to use large arrays, use np.memmap instead of loading the data into memory. The advantage of memmap is that the data lives in a file on disk and is written out as needed, so the array does not have to fit in RAM. For example, you can create a memory-mapped array in the following way:
import numpy as np
# np.int was removed in NumPy 1.24; use an explicit dtype that matches your data
a=np.memmap('myFile',dtype=np.int64,mode='w+',shape=(1000000,8000))
You can then use 'a' like a normal NumPy array; the limit is then the size of your hard disk. This creates a file on disk that you can read later: just change mode to 'r' and read data from the array. More info about memmap here: https://docs.scipy.org/doc/numpy/reference/generated/numpy.memmap.html
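As a small illustration of reading the file back, here is a sketch that writes a tiny memmap and reopens it read-only. The temporary path, the (10, 8) shape, and the np.int64 dtype are assumptions chosen to keep the example fast:

```python
import os
import tempfile

import numpy as np

# Hypothetical location for the backing file.
path = os.path.join(tempfile.mkdtemp(), "myFile")

# Create the file-backed array and write a few values.
a = np.memmap(path, dtype=np.int64, mode="w+", shape=(10, 8))
a[3, :] = np.arange(8)
a.flush()
del a  # drop the writable mapping

# Reopen read-only. The raw file stores no header, so the same dtype
# and shape have to be supplied again.
b = np.memmap(path, dtype=np.int64, mode="r", shape=(10, 8))
print(b[3])
```

Because a raw memmap file carries no metadata, the dtype and shape need to be recorded somewhere else if the file is to be reopened later.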
To fill that array from npy files of shape (1,8000), just write:
for i,npFile in enumerate(npfiles):
    a[i,:]=np.load(npFile)
a.flush()
The flush method ensures everything has been written to disk.
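Putting the pieces together, the following is one possible end-to-end sketch, assuming every file holds a single row of shape (1, n) with the same dtype. The function name fill_memmap and the demo folders are hypothetical; the number of rows and the dtype are taken from the files rather than hard-coded:

```python
import glob
import os
import tempfile

import numpy as np


def fill_memmap(folder, out_path):
    """Copy each (1, n) .npy file in `folder` into one row of a memmap at `out_path`."""
    paths = sorted(glob.glob(os.path.join(folder, "*.npy")))
    first = np.load(paths[0])  # peek at one file for dtype and row length
    shape = (len(paths), first.shape[1])
    out = np.memmap(out_path, dtype=first.dtype, mode="w+", shape=shape)
    for i, p in enumerate(paths):
        out[i, :] = np.load(p)[0]  # only one small file is in memory at a time
    out.flush()  # make sure pending writes reach the disk
    return out


# Small demo with made-up data: three files of shape (1, 5).
src_dir = tempfile.mkdtemp()
for k in range(3):
    np.save(os.path.join(src_dir, f"file{k}.npy"), np.full((1, 5), k))

big_path = os.path.join(tempfile.mkdtemp(), "big_array.dat")
result = fill_memmap(src_dir, big_path)
print(result.shape)
```

Only one input file is held in memory per iteration, which is what keeps the peak memory use small compared with np.concatenate.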