How to use tensor2tensor to classify text?
I want to do binary text classification using tensor2tensor, with attention only and no LSTM or CNN preprocessing layers. I think the transformer_encoder model is the best fit for me, but I can't find any suitable predefined Problem or Hparams. Can anyone give me a text classification example using tensor2tensor, or some other advice?
I would recommend following their sentiment_imdb problem, since sentiment analysis is a text-classification problem:
https://github.com/tensorflow/tensor2tensor/blob/master/tensor2tensor/data_generators/imdb.py
They also have a brief section on the main page about training a transformer_encoder for this problem:
https://github.com/tensorflow/tensor2tensor#sentiment-analysis
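If the IMDB data does not fit your task, imdb.py also shows the pattern for registering your own binary classification problem: subclass text_problems.Text2ClassProblem and implement generate_samples, which yields a dict with the raw text under "inputs" and an integer class id under "label". Here is a minimal sketch of just that sample format in plain Python, with the tensor2tensor subclassing left out so it runs standalone; the toy sentences and label names are invented for illustration:

```python
# Stand-in for the generator a custom Text2ClassProblem would implement.
# In tensor2tensor this logic would live in generate_samples() on your
# registered Problem subclass (see SentimentIMDB in imdb.py).

LABELS = ["neg", "pos"]  # what the problem's class_labels() would return


def generate_samples(examples):
    """Yield dicts in the format Text2ClassProblem expects:
    {"inputs": <raw text>, "label": <integer class id>}."""
    for text, label_name in examples:
        yield {"inputs": text, "label": LABELS.index(label_name)}


if __name__ == "__main__":
    toy_data = [
        ("a wonderful, moving film", "pos"),
        ("dull and far too long", "neg"),
    ]
    for sample in generate_samples(toy_data):
        print(sample)
```

tensor2tensor then handles tokenization and vocabulary building from these raw-text samples, so the generator only has to produce text/label pairs.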
Try this:
PROBLEM=sentiment_imdb
MODEL=transformer_encoder
HPARAMS=transformer_tiny
DATA_DIR=$HOME/t2t_data
TMP_DIR=/tmp/t2t_datagen
TRAIN_DIR=$HOME/t2t_train/$PROBLEM/$MODEL-$HPARAMS
mkdir -p $DATA_DIR $TMP_DIR $TRAIN_DIR
# Generate data
t2t-datagen \
--data_dir=$DATA_DIR \
--tmp_dir=$TMP_DIR \
--problem=$PROBLEM
# Train
# * If you run out of memory, add --hparams='batch_size=1024'.
t2t-trainer \
--data_dir=$DATA_DIR \
--problem=$PROBLEM \
--model=$MODEL \
--hparams_set=$HPARAMS \
--output_dir=$TRAIN_DIR