How to use tensor2tensor to classify text?
I want to do binary text classification using tensor2tensor, with attention only and no LSTM or CNN preprocessing layers. I think the transformer_encoder model is the best fit for me, but I can't find any suitable predefined Problem or Hparams. Can anyone give me a text classification example using tensor2tensor, or some other advice?
I would recommend following their sentiment_imdb problem, since sentiment analysis is a text-classification problem:
https://github.com/tensorflow/tensor2tensor/blob/master/tensor2tensor/data_generators/imdb.py
They also have a brief section on the main page about training a transformer_encoder for this problem:
https://github.com/tensorflow/tensor2tensor#sentiment-analysis
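If the IMDB data does not fit your task, imdb.py also shows the pattern for registering your own binary classification problem: subclass text_problems.Text2ClassProblem and implement generate_samples, which yields a dict with the raw text under "inputs" and an integer class id under "label". Here is a minimal sketch of just that sample format in plain Python, with the tensor2tensor subclassing left out so it runs standalone; the toy sentences and label names are invented for illustration:

```python
# Stand-in for the generator a custom Text2ClassProblem would implement.
# In tensor2tensor this logic would live in generate_samples() on your
# registered Problem subclass (see SentimentIMDB in imdb.py).

LABELS = ["neg", "pos"]  # what the problem's class_labels() would return


def generate_samples(examples):
    """Yield dicts in the format Text2ClassProblem expects:
    {"inputs": <raw text>, "label": <integer class id>}."""
    for text, label_name in examples:
        yield {"inputs": text, "label": LABELS.index(label_name)}


if __name__ == "__main__":
    toy_data = [
        ("a wonderful, moving film", "pos"),
        ("dull and far too long", "neg"),
    ]
    for sample in generate_samples(toy_data):
        print(sample)
```

tensor2tensor then handles tokenization and vocabulary building from these raw-text samples, so the generator only has to produce text/label pairs.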
Try this:
PROBLEM=sentiment_imdb
MODEL=transformer_encoder
HPARAMS=transformer_tiny
DATA_DIR=$HOME/t2t_data
TMP_DIR=/tmp/t2t_datagen
TRAIN_DIR=$HOME/t2t_train/$PROBLEM/$MODEL-$HPARAMS
mkdir -p $DATA_DIR $TMP_DIR $TRAIN_DIR
# Generate data
t2t-datagen \
--data_dir=$DATA_DIR \
--tmp_dir=$TMP_DIR \
--problem=$PROBLEM
# Train
# * If you run out of memory, add --hparams='batch_size=1024'.
t2t-trainer \
--data_dir=$DATA_DIR \
--problem=$PROBLEM \
--model=$MODEL \
--hparams_set=$HPARAMS \
--output_dir=$TRAIN_DIR