pad 2d arrays in order to concatenate them

Question

This is probably a very basic question, but I am struggling to get the math right. I have a list of arrays of different sizes. Their shapes look like this:

(30, 300)
(7, 300)
(16, 300)
(10, 300)
(12, 300)
(33, 300)
(5, 300)
(11, 300)
(18, 300)
(31, 300)
(11, 300)

I want to use them as features for text classification, so I need to concatenate them into one big matrix, which is not possible because of the different shapes. My idea was to pad them with zeros so that they all have the shape (33, 300), but I'm not sure how to do that. I tried this:

padded_arrays = []
for p in np_posts:
    padded_arrays.append(numpy.pad(p,(48,0),'constant',constant_values = (0,0)))

which resulted in

(78, 348)
(55, 348)
(64, 348)
(58, 348)
(60, 348)
(81, 348)
(53, 348)
(59, 348)
(66, 348)
(79, 348)
(59, 348)

Please help me

Answer 1

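You need to specify the padding for each edge of each dimension. A single pair such as (48, 0) is applied to every dimension, which is why both sizes grew by 48. The pad width is the number of cells to add, not the target size, so you have to compute it from the number of rows each array is missing: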

np.pad(p, ((0, 33 - p.shape[0]), (0, 0)), 'constant', constant_values=0)

(0, 33 - p.shape[0]) pads the first dimension at its end (appending rows) and adds nothing at its start (no prepending).

(0, 0) disables padding of the second dimension, leaving its size unchanged (300 -> 300).
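
Putting it together, here is a minimal sketch of the whole loop. The np_posts list below is made-up stand-in data (arrays of ones with a few of the row counts from the question), and max_rows is a name introduced here so that the target size does not have to be hard-coded as 33. np.stack is just one possible way to combine the padded arrays, assuming you want a 3-D result of shape (number of posts, 33, 300).

```python
import numpy as np

# Hypothetical stand-in for the question's np_posts: arrays with varying row counts.
np_posts = [np.ones((n, 300)) for n in (30, 7, 16, 33, 5)]

# The attempt from the question: a single (48, 0) pair is applied to BOTH axes,
# so 48 cells are prepended along the rows and along the columns.
wrong = np.pad(np_posts[0], (48, 0), 'constant', constant_values=0)
print(wrong.shape)  # (78, 348)

# Pad to the largest row count in the list instead of hard-coding 33.
max_rows = max(p.shape[0] for p in np_posts)

padded_arrays = []
for p in np_posts:
    # Append only the missing rows; leave the 300 columns untouched.
    pad_width = ((0, max_rows - p.shape[0]), (0, 0))
    padded_arrays.append(np.pad(p, pad_width, 'constant', constant_values=0))

# All arrays now share one shape, so they can be combined.
stacked = np.stack(padded_arrays)
print(stacked.shape)  # (5, 33, 300)
```

If you would rather have one row per post, stacked.reshape(len(np_posts), -1) flattens each padded array into a single feature vector.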

solution1
1 ACCEPTED 2021-01-14 10:04:12
