

How to combine 2 different shaped pytorch tensors in training?

At the moment my model gives 3 output tensors. I want two of them to be more cooperative: I want to pass the combination of self.dropout1(hs) and self.dropout2(cls_hs) through the self.entity_out linear layer. The issue is that these two tensors have different shapes.

Current Code

import torch.nn as nn
import transformers

class NLUModel(nn.Module):
    def __init__(self, num_entity, num_intent, num_scenarios):
        super(NLUModel, self).__init__()
        self.num_entity = num_entity
        self.num_intent = num_intent
        self.num_scenario = num_scenarios

        self.bert = transformers.BertModel.from_pretrained(config.BASE_MODEL)

        self.dropout1 = nn.Dropout(0.3)
        self.dropout2 = nn.Dropout(0.3)
        self.dropout3 = nn.Dropout(0.3)

        self.entity_out = nn.Linear(768, self.num_entity)
        self.intent_out = nn.Linear(768, self.num_intent)
        self.scenario_out = nn.Linear(768, self.num_scenario)

    def forward(self, ids, mask, token_type_ids):
        out = self.bert(input_ids=ids, attention_mask=mask,
                        token_type_ids=token_type_ids)

        hs, cls_hs = out['last_hidden_state'], out['pooler_output']

        entity_hs = self.dropout1(hs)
        intent_hs = self.dropout2(cls_hs)
        scenario_hs = self.dropout3(cls_hs)

        entity_hs = self.entity_out(entity_hs)
        intent_hs = self.intent_out(intent_hs)
        scenario_hs = self.scenario_out(scenario_hs)

        return entity_hs, intent_hs, scenario_hs

Required

def forward(self, ids, mask, token_type_ids):
    out = self.bert(input_ids=ids, attention_mask=mask,
                    token_type_ids=token_type_ids)

    hs, cls_hs = out['last_hidden_state'], out['pooler_output']

    entity_hs = self.dropout1(hs)
    intent_hs = self.dropout2(cls_hs)
    scenario_hs = self.dropout3(cls_hs)

    entity_hs = self.entity_out(concat(entity_hs, intent_hs)) # Concatenation
    intent_hs = self.intent_out(intent_hs)
    scenario_hs = self.scenario_out(scenario_hs)

    return entity_hs, intent_hs, scenario_hs

Let's say I was successful in concatenating... will the backward propagation work?

The shape of entity_hs (last_hidden_state) is [batch_size, sequence_length, hidden_size], while the shape of intent_hs (pooler_output) is just [batch_size, hidden_size], so putting them together directly may not make sense. It depends on what you want to do.

If, for some reason, you want an output of shape [batch_size, sequence_length, channels], you can tile the intent_hs tensor:

intent_hs = torch.tile(intent_hs[:, None, :], (1, sequence_length, 1))
... = torch.cat([entity_hs, intent_hs], dim=2)
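
As a sanity check, here is a minimal, self-contained sketch of the tiling approach with dummy tensors; the batch_size, sequence_length, and hidden_size values are illustrative assumptions, not taken from the question.

import torch

batch_size, sequence_length, hidden_size = 4, 16, 768
entity_hs = torch.randn(batch_size, sequence_length, hidden_size)
intent_hs = torch.randn(batch_size, hidden_size)

# Repeat the pooled vector along the sequence dimension so both tensors
# share the shape [batch_size, sequence_length, hidden_size], then
# concatenate along the feature dimension.
intent_tiled = torch.tile(intent_hs[:, None, :], (1, sequence_length, 1))
combined = torch.cat([entity_hs, intent_tiled], dim=2)
print(combined.shape)  # torch.Size([4, 16, 1536])

Note that the concatenated tensor has 2 * hidden_size features, so the layer that consumes it (self.entity_out in the question) would need in_features = 768 * 2 rather than 768.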

If you want to get [batch_size, channels] instead, you can reduce the entity_hs tensor, for example by averaging over the sequence dimension:

entity_hs = torch.mean(entity_hs, dim=1) 
... = torch.cat([entity_hs, intent_hs], dim=1) 
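
A corresponding sketch for the averaging approach, using the same illustrative dummy shapes as above:

import torch

batch_size, sequence_length, hidden_size = 4, 16, 768
entity_hs = torch.randn(batch_size, sequence_length, hidden_size)
intent_hs = torch.randn(batch_size, hidden_size)

# Average over the sequence dimension so entity_hs collapses to
# [batch_size, hidden_size], matching intent_hs, then concatenate features.
entity_mean = torch.mean(entity_hs, dim=1)
combined = torch.cat([entity_mean, intent_hs], dim=1)
print(combined.shape)  # torch.Size([4, 1536])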

Yes, the backward pass will propagate gradients through the concatenation (and the rest).
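
As a quick check (not part of the original answer), torch.cat is differentiable and gradients reach both inputs:

import torch

a = torch.randn(2, 3, requires_grad=True)
b = torch.randn(2, 5, requires_grad=True)

# Concatenate, reduce to a scalar, and backpropagate.
loss = torch.cat([a, b], dim=1).sum()
loss.backward()
print(a.grad.shape, b.grad.shape)  # torch.Size([2, 3]) torch.Size([2, 5])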
