
Pad 2D arrays in order to concatenate them

This is probably a very basic question, but I am struggling to get the math right. I have a list of arrays of different sizes. Their shapes look like this:

(30, 300)
(7, 300)
(16, 300)
(10, 300)
(12, 300)
(33, 300)
(5, 300)
(11, 300)
(18, 300)
(31, 300)
(11, 300)

I want to use them as features for text classification, so I need to concatenate them into one big matrix, which is not possible because of the different shapes. My idea was to pad them with zeros so that they all have the shape (33, 300), but I'm not sure how to do that. I tried this:

padded_arrays = []
for p in np_posts:
    padded_arrays.append(numpy.pad(p,(48,0),'constant',constant_values = (0,0)))

which resulted in

(78, 348)
(55, 348)
(64, 348)
(58, 348)
(60, 348)
(81, 348)
(53, 348)
(59, 348)
(66, 348)
(79, 348)
(59, 348)

Please help me

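You need to specify the padding for each edge of each dimension. A single pair such as (48, 0) is applied to every axis, which is why both dimensions grew by 48 in your attempt. The pad width is the number of cells to add, not the target size, so you have to compute it from the "missing" size of each array: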

np.pad(p, ((0, 33 - p.shape[0]), (0, 0)), 'constant', constant_values=0)

(0, 33 - p.shape[0]) pads the first dimension at its end (appending rows) and adds nothing at its start (no rows are prepended).

(0, 0) disables padding of the second dimension, leaving its size unchanged (300 -> 300).
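
For completeness, here is a minimal sketch of the whole workflow, from padding to combining. The arrays are made-up stand-ins with the shapes from the question (the real np_posts would hold the actual features), and the target row count is taken from the longest array instead of hard-coding 33. Whether you want np.stack or np.concatenate at the end depends on what the classifier expects, so both are shown.

```python
import numpy as np

# Hypothetical stand-in data: arrays of ones with the row counts from the question.
row_counts = [30, 7, 16, 10, 12, 33, 5, 11, 18, 31, 11]
np_posts = [np.ones((n, 300)) for n in row_counts]

# Use the longest array as the target instead of hard-coding 33.
target_rows = max(p.shape[0] for p in np_posts)

# Append zero rows to axis 0 only; axis 1 gets (0, 0), i.e. no padding.
padded_arrays = [
    np.pad(p, ((0, target_rows - p.shape[0]), (0, 0)),
           mode="constant", constant_values=0)
    for p in np_posts
]

# Option 1: one 3D array with a new leading axis, shape (11, 33, 300).
stacked = np.stack(padded_arrays)

# Option 2: one 2D matrix with all rows joined along axis 0, shape (363, 300).
concatenated = np.concatenate(padded_arrays, axis=0)
```

Every entry of padded_arrays now has shape (33, 300), with the original rows first and the zero rows after them.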
