I've tried the method outlined by Hpaulji, but it doesn't seem to be working:
How to append many numpy files into one numpy file in python
Basically, I'm iterating through a generator, making some changes to an array, and then trying to save each iteration's array.
Here is what my sample code looks like:
filename = 'testing.npy'
with open(filename, 'wb') as f:
    for x, _ in train_generator:
        prediction = base_model.predict(x)
        print(prediction[0,0,0,0:5])
        np.save(filename, prediction)
        current_iteration += 1
        if current_iteration == 5:
            break
Here, I'm going through 5 iterations, so I was hoping to save 5 different arrays.
I printed out a portion of each array, for debugging purposes:
[ 0. 0. 0. 0. 0.]
[ 0. 3.37349415 0. 0. 1.62561738]
[ 0. 20.28489304 0. 0. 0. ]
[ 0. 0. 0. 0. 0.]
[ 0. 21.98013496 0. 0. 0. ]
But when I try to load the array multiple times, as noted in "How to append many numpy files into one numpy file in python", I get an EOFError:
file = r'testing.npy'
with open(file, 'rb') as f:
    arr = np.load(f)
    print(arr[0,0,0,0:5])
    arr = np.load(f)
    print(arr[0,0,0,0:5])
It only outputs the last array and then raises an EOFError:
[ 0. 21.98013496 0. 0. 0. ]
EOFError: Ran out of input
print(arr[0,0,0,0:5])
I was expecting all 5 arrays to be saved, but when I load the saved .npy file multiple times, I only get the last array.
So, how should I be saving and appending a new array to a file?
EDIT: Testing with '.npz' only saves the last array
filename = 'testing.npz'
current_iteration = 0
with open(filename, 'wb') as f:
    for x, _ in train_generator:
        prediction = base_model.predict(x)
        print(prediction[0,0,0,0:5])
        np.savez(f, prediction)
        current_iteration += 1
        if current_iteration == 5:
            break
# loading
file = 'testing.npz'
with open(file, 'rb') as f:
    arr = np.load(f)
    print(arr.keys())
>>>['arr_0']
All your calls to np.save use the filename, not the file handle. Since you do not reuse the file handle, each save overwrites the file instead of appending the array to it.
This should work:
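filename = 'testing.npy'
current_iteration = 0  # the counter must be initialized before the loop
with open(filename, 'wb') as f:
    for x, _ in train_generator:
        prediction = base_model.predict(x)
        print(prediction[0,0,0,0:5])
        np.save(f, prediction)  # pass the open file handle, not the filename
        current_iteration += 1
        if current_iteration == 5:
            break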
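To check the round trip without the model, here is a minimal sketch that writes several arrays through one open file handle and reads them back. The stand-in arrays, the temporary path, and the size-based loop condition are assumptions for illustration and are not part of the original code.

```python
import os
import tempfile

import numpy as np

# Stand-in data (assumption): five small arrays instead of model predictions.
arrays = [np.full((2, 3), i, dtype=np.float32) for i in range(5)]

# Temporary location (assumption) so the sketch does not touch real files.
path = os.path.join(tempfile.mkdtemp(), "testing.npy")

# Write every array through the same open handle so each save is appended.
with open(path, "wb") as f:
    for arr in arrays:
        np.save(f, arr)

# Read the arrays back one at a time until the file position reaches the end.
loaded = []
with open(path, "rb") as f:
    file_size = os.fstat(f.fileno()).st_size
    while f.tell() < file_size:
        loaded.append(np.load(f))

print(len(loaded))
```

Comparing the file position with the file size avoids relying on an exception to detect the end of the file. If the number of saved arrays is known in advance, calling np.load(f) that many times works just as well.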
And while there may be advantages to storing multiple arrays in one .npy file (I imagine advantages in situations where memory is limited), .npy files are technically meant to store a single array. To store multiple arrays, you can use .npz files (np.savez or np.savez_compressed):
filename = 'testing.npz'
predictions = []
for (x, _), index in zip(train_generator, range(5)):
    prediction = base_model.predict(x)
    predictions.append(prediction)
np.savez(filename, predictions) # will name it arr_0
# np.savez(filename, predictions=predictions) # would name it predictions
# np.savez(filename, *predictions) # would name it arr_0, arr_1, …, arr_4
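For completeness, a minimal sketch of reading such .npz files back. The stand-in arrays and temporary paths are assumptions for illustration. It shows two of the naming variants: positional arguments (arr_0, arr_1, …) and a single keyword argument.

```python
import os
import tempfile

import numpy as np

# Stand-in data (assumption): five equally shaped arrays instead of predictions.
predictions = [np.full((2, 3), i, dtype=np.float32) for i in range(5)]
tmp_dir = tempfile.mkdtemp()

# Variant 1: one entry per array, named arr_0 ... arr_4.
separate_path = os.path.join(tmp_dir, "separate.npz")
np.savez(separate_path, *predictions)
with np.load(separate_path) as data:
    names = list(data.files)
    restored = [data[f"arr_{i}"] for i in range(len(names))]

# Variant 2: the whole list stored as one stacked array under a chosen name.
stacked_path = os.path.join(tmp_dir, "stacked.npz")
np.savez(stacked_path, predictions=predictions)
with np.load(stacked_path) as data:
    stacked = data["predictions"]

print(names)
print(stacked.shape)
```

Storing the list under a single name only works cleanly when all arrays have the same shape, since NumPy stacks them into one array. Arrays with differing shapes are better saved as separate entries.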