RNN with inconsistent (repeated) padding (using PyTorch's pack_padded_sequence)
Following the example from the PyTorch docs, I am trying to solve a problem where the padding is not consistently at the end of the tensor for each batch (in other words, no pun intended, I have both left-censored and right-censored padding across my batches):
import torch
from torch.nn.utils.rnn import pack_padded_sequence

# Data structure example from the docs: padding always at the end
seq = torch.tensor([[1, 2, 0], [3, 0, 0], [4, 5, 6]])
# Data structure of my problem: padding can appear on either side
inconsistent_seq = torch.tensor([[1, 2, 0], [0, 3, 0], [0, 5, 6]])
lens = ...?
packed = pack_padded_sequence(inconsistent_seq, lens, batch_first=True, enforce_sorted=False)
How can I mask these padded 0's when running the batch through an LSTM, preferably using built-in PyTorch functionality?
I "solved" this by essentially re-indexing my data and padding the left-censored entries with 0's (which makes sense for my problem). I also injected an extra tensor into the input dimension to track this padding. I then masked the right-censored data using the pack_padded_sequence method from the PyTorch library. Found a good source here:
https://www.kdnuggets.com/2018/06/taming-lstms-variable-sized-mini-batches-pytorch.html
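The approach above can be sketched roughly as follows. This is a minimal illustration, not the answerer's actual code: it left-aligns each row so all padding becomes right-censored, records an extra per-timestep feature marking the original shift (the variable names `aligned`, `shift_flag`, and `offset` are my own), and then packs with pack_padded_sequence:

```python
import torch
from torch.nn.utils.rnn import pack_padded_sequence

# Hypothetical batch: zeros are padding, scattered on both sides
inconsistent_seq = torch.tensor([[1, 2, 0], [0, 3, 0], [0, 5, 6]])

# 1) Re-index: shift the real values to the front of each row so
#    all padding ends up on the right (right-censored only).
aligned = torch.zeros_like(inconsistent_seq)
lens = (inconsistent_seq != 0).sum(dim=1)  # true length of each sequence
for i, row in enumerate(inconsistent_seq):
    vals = row[row != 0]                   # drop the padding zeros
    aligned[i, : vals.numel()] = vals

# 2) Extra input feature tracking the original left padding: the
#    index of the first real value in each row, broadcast per step.
offset = torch.argmax((inconsistent_seq != 0).int(), dim=1)
shift_flag = offset.unsqueeze(1).expand_as(aligned).float()

# Stack value + flag into a (batch, seq_len, 2) input for an LSTM
inputs = torch.stack([aligned.float(), shift_flag], dim=2)

# 3) Mask the remaining right padding via pack_padded_sequence
packed = pack_padded_sequence(inputs, lens, batch_first=True,
                              enforce_sorted=False)
```

Here `aligned` becomes [[1, 2, 0], [3, 0, 0], [5, 6, 0]] with `lens` [2, 1, 2], so the packed batch contains only the five real timesteps.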