RNN with inconsistent (repeated) padding (using PyTorch's pack_padded_sequence)
Following the example from the PyTorch docs, I am trying to solve a problem where the padding is not consistently at the end of the tensor for each batch (in other words, no pun intended, I have both left-censored and right-censored padding across my batches):
import torch
from torch.nn.utils.rnn import pack_padded_sequence

# Data structure example from the docs: padding always at the end
seq = torch.tensor([[1, 2, 0], [3, 0, 0], [4, 5, 6]])
# Data structure of my problem: padding can appear on either side
inconsistent_seq = torch.tensor([[1, 2, 0], [0, 3, 0], [0, 5, 6]])
lens = ...?
packed = pack_padded_sequence(inconsistent_seq, lens, batch_first=True, enforce_sorted=False)
How can I mask these padded 0's when running the batch through an LSTM, preferably using built-in PyTorch functionality?
I "solved" this by essentially re-indexing my data and padding the left-censored entries with 0's (which makes sense for my problem). I also injected an extra tensor into the input dimension to track this padding. I then masked the right-censored data using the pack_padded_sequence method from the PyTorch library. Found a good source here:
https://www.kdnuggets.com/2018/06/taming-lstms-variable-sized-mini-batches-pytorch.html
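The approach above can be sketched roughly as follows. This is a minimal illustration, not the answerer's actual code: it left-aligns each row so all padding becomes right-censored, records an extra per-timestep feature marking the original shift (the variable names `aligned`, `shift_flag`, and `offset` are my own), and then packs with pack_padded_sequence:

```python
import torch
from torch.nn.utils.rnn import pack_padded_sequence

# Hypothetical batch: zeros are padding, scattered on both sides
inconsistent_seq = torch.tensor([[1, 2, 0], [0, 3, 0], [0, 5, 6]])

# 1) Re-index: shift the real values to the front of each row so
#    all padding ends up on the right (right-censored only).
aligned = torch.zeros_like(inconsistent_seq)
lens = (inconsistent_seq != 0).sum(dim=1)  # true length of each sequence
for i, row in enumerate(inconsistent_seq):
    vals = row[row != 0]                   # drop the padding zeros
    aligned[i, : vals.numel()] = vals

# 2) Extra input feature tracking the original left padding: the
#    index of the first real value in each row, broadcast per step.
offset = torch.argmax((inconsistent_seq != 0).int(), dim=1)
shift_flag = offset.unsqueeze(1).expand_as(aligned).float()

# Stack value + flag into a (batch, seq_len, 2) input for an LSTM
inputs = torch.stack([aligned.float(), shift_flag], dim=2)

# 3) Mask the remaining right padding via pack_padded_sequence
packed = pack_padded_sequence(inputs, lens, batch_first=True,
                              enforce_sorted=False)
```

Here `aligned` becomes [[1, 2, 0], [3, 0, 0], [5, 6, 0]] with `lens` [2, 1, 2], so the packed batch contains only the five real timesteps.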