Combining Numpy Arrays Feature Matrices In Python

Question

I have data with 38910 rows and 2 columns. Since it is string data, I used two feature creation methods, A and B.

Method A gives me a NumPy array of shape:

a.shape = (38910, 17, 21)

Method B gives me a NumPy array of shape:

b.shape = (38910, 16, 441)

Now, to apply a convolutional neural network and other methods, I need to combine both feature sets into a single NumPy array of shape (38910, 17, 21, 16, 441). What is the best way to do that without running into memory issues?

Answer 1

One way to avoid memory issues is to process the rows in batches. Assuming that you have a function combine_features(a, b) that combines the outputs of method A and method B, here's a rough outline of a solution:

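import numpy as np

# Split both feature arrays into the same 500 row-wise batches.
a_batches = np.array_split(a, 500)
b_batches = np.array_split(b, 500)
for i, (a_batch, b_batch) in enumerate(zip(a_batches, b_batches)):
    # Combine one batch at a time and write it to disk.
    output = combine_features(a_batch, b_batch)
    np.save(f"{destination_folder}/data-{i}.npy", output)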
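The outline above leaves combine_features undefined. As a minimal sketch, assuming the combination is meant to be an outer product over the feature axes (an assumption, since the question does not say how the two feature sets should be merged), it could look like the following. The random arrays are small stand-ins for the real data, and the last lines estimate how large the full array would be as float32.

```python
import numpy as np

def combine_features(a_batch, b_batch):
    # Hypothetical combination: an outer product over the feature axes,
    # so (n, 17, 21) and (n, 16, 441) give (n, 17, 21, 16, 441).
    return np.einsum("nij,nkl->nijkl", a_batch, b_batch)

# Small random stand-ins for the real feature arrays (8 rows, not 38910).
rng = np.random.default_rng(0)
a = rng.random((8, 17, 21), dtype=np.float32)
b = rng.random((8, 16, 441), dtype=np.float32)

combined = combine_features(a, b)
print(combined.shape)  # (8, 17, 21, 16, 441)

# Size of the full array if it were built in one piece as float32.
full_shape = (38910, 17, 21, 16, 441)
n_bytes = np.prod(full_shape, dtype=np.int64) * np.dtype(np.float32).itemsize
print(f"{n_bytes / 1e9:.0f} GB")  # 392 GB
```

At roughly 392 GB, the full array will not fit in memory on most machines, which is why the batches are written to disk. With 500 batches of about 78 rows each, a single float32 batch is roughly 0.8 GB.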

Then as you are training, you can iterate through the saved files and load one at a time.
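Loading one file at a time can be sketched with a generator. The helper name iter_batches and the temporary folder are illustrative assumptions; only the data-{i}.npy naming comes from the outline above. The demo writes tiny stand-in files so the sketch runs on its own.

```python
import glob
import os
import tempfile

import numpy as np

def iter_batches(folder):
    # Yield one saved batch at a time so only a single file is in memory.
    # Sort by the integer in "data-<i>.npy"; a plain string sort would
    # put data-10 before data-2.
    paths = glob.glob(os.path.join(folder, "data-*.npy"))
    paths.sort(key=lambda p: int(os.path.basename(p)[len("data-"):-len(".npy")]))
    for path in paths:
        yield np.load(path)

# Demo with tiny stand-in files written to a temporary folder.
with tempfile.TemporaryDirectory() as destination_folder:
    for i in range(12):
        np.save(os.path.join(destination_folder, f"data-{i}.npy"), np.full((2, 3), i))
    # Each batch would normally be passed to a training step here.
    firsts = [int(batch[0, 0]) for batch in iter_batches(destination_folder)]

print(firsts)  # batches come back in order 0 through 11
```

If a single batch is still too large, np.load also accepts mmap_mode="r", which maps the file from disk instead of reading it all at once.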
