
Pad 2D arrays in order to concatenate them

This is probably a very basic question, but I struggle to get the math right. I have a list of arrays of different sizes. The shapes look like this:

(30, 300)
(7, 300)
(16, 300)
(10, 300)
(12, 300)
(33, 300)
(5, 300)
(11, 300)
(18, 300)
(31, 300)
(11, 300)

I want to use them as a feature in text classification, which is why I need to concatenate them into one big matrix. That is not possible because of the different shapes. My idea was to pad them with zeros so that they all have the shape (33, 300), but I'm not sure how to do that. I tried this:

padded_arrays = []
for p in np_posts:
    padded_arrays.append(numpy.pad(p,(48,0),'constant',constant_values = (0,0)))

which resulted in:

(78, 348)
(55, 348)
(64, 348)
(58, 348)
(60, 348)
(81, 348)
(53, 348)
(59, 348)
(66, 348)
(79, 348)
(59, 348)

Please help me.

You need to specify the padding for each edge of each dimension. A single pair such as (48, 0) is applied to every axis, which is why both dimensions grew by 48. The pad width is an amount added to the existing size, not a target size, so you have to compute the "missing" amount for each array:

np.pad(p, ((0, 33 - p.shape[0]), (0, 0)), 'constant', constant_values=0)

(0, 33 - p.shape[0]) pads the first dimension at its trailing edge (appending rows of zeros) and leaves its leading edge unpadded (nothing is prepended).

(0, 0) disables padding of the second dimension, leaving its size as it is (300 -> 300).
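
Putting this together for the whole list, here is a minimal sketch. The helper name pad_rows and its target_rows parameter are made up for illustration, and np_posts is filled with stand-in data using a few of the shapes from the question. It assumes every array is 2D with the same number of columns, and that target_rows is at least as large as the tallest array (np.pad rejects negative pad widths).

```python
import numpy as np

def pad_rows(arrays, target_rows=None):
    """Zero-pad each 2D array at the bottom so all share the same row count.

    pad_rows and target_rows are illustrative names, not part of NumPy.
    """
    if target_rows is None:
        # Default to the tallest array in the list (33 in the question).
        target_rows = max(a.shape[0] for a in arrays)
    return [
        # Axis 0: add the missing rows after the data; axis 1: leave untouched.
        np.pad(a, ((0, target_rows - a.shape[0]), (0, 0)),
               mode="constant", constant_values=0)
        for a in arrays
    ]

# Stand-in data: ones, so the zero padding is easy to spot.
row_counts = [30, 7, 16, 33, 5]
np_posts = [np.ones((n, 300)) for n in row_counts]

padded_arrays = pad_rows(np_posts)

# All arrays now have the same shape, so they can be stacked along a new axis.
stacked = np.stack(padded_arrays)
print(stacked.shape)
```

np.stack produces one array of shape (number of posts, 33, 300). Whether that 3D array needs to be flattened further depends on what the classifier expects.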


 