
Are long sentences not good for deep learning models?

I'm interested to know whether long sentences are good for tensor2tensor model training, and why or why not.

Ideally, the training data should have the same distribution of sentence lengths as the target test data. For example, in machine translation, if the final model is intended to translate long sentences, similarly long sentences should also be used for training. The Transformer model does not seem to generalize to sentences longer than those used for training, but limiting the maximum sentence length in training makes it possible to use larger batch sizes, which is helpful (Popel and Bojar, 2018).
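To make this concrete, here is a minimal, hypothetical sketch (plain Python, not tensor2tensor's actual API) of the kind of length filtering the answer describes: dropping sentence pairs whose source or target exceeds a token cap so that the training length distribution stays within the range the model will see at test time.

```python
# Illustrative sketch only: filter a parallel corpus by a maximum token
# length, as is commonly done before Transformer training. The corpus and
# the cap value are made-up examples, not tensor2tensor defaults.

def filter_by_length(pairs, max_tokens):
    """Keep only pairs whose source AND target fit within max_tokens
    (counted here by naive whitespace splitting)."""
    return [
        (src, tgt)
        for src, tgt in pairs
        if len(src.split()) <= max_tokens and len(tgt.split()) <= max_tokens
    ]

corpus = [
    ("a short sentence", "une phrase courte"),
    ("this is a much longer sentence that exceeds the cap used in this example",
     "ceci est une phrase beaucoup plus longue qui depasse la limite"),
]

kept = filter_by_length(corpus, max_tokens=10)
print(len(kept))  # only the short pair survives the cap
```

The trade-off the answer points at: a lower cap lets you pack more sentences into each batch (fixed memory budget), but the model then never sees, and may fail to generalize to, longer inputs.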

