
How is BERT Layer sequence output used?

I am reading this Kaggle notebook.

In the class DisasterDetector, in build_model(), clf_output = sequence_output[:, 0, :]. A sigmoid activation is then applied to generate the model output.

The TF Hub page the BertLayer was obtained from describes the shape of sequence_output as [batch_size, max_seq_length, 768]. Why do we take only the first index along the max_seq_length dimension (index 0)? If this corresponds only to the first token of the output sequence and not the other tokens, why is it used for the binary classification task?
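For context, here is a minimal sketch of the classifier head being described, assuming a TF Hub BERT module and a max_len of 128 (both the module URL and the sequence length are illustrative choices, not necessarily what the notebook uses):

```python
import tensorflow as tf
import tensorflow_hub as hub

max_len = 128  # assumed maximum sequence length for illustration

# Assumed TF Hub BERT module; the notebook may use a different one
bert_layer = hub.KerasLayer(
    "https://tfhub.dev/tensorflow/bert_en_uncased_L-12_H-768_A-12/1",
    trainable=True,
)

input_word_ids = tf.keras.Input(shape=(max_len,), dtype=tf.int32, name="input_word_ids")
input_mask = tf.keras.Input(shape=(max_len,), dtype=tf.int32, name="input_mask")
segment_ids = tf.keras.Input(shape=(max_len,), dtype=tf.int32, name="segment_ids")

# sequence_output has shape [batch_size, max_len, 768]
pooled_output, sequence_output = bert_layer([input_word_ids, input_mask, segment_ids])

# Keep only the hidden state at position 0 (the [CLS] token): shape [batch_size, 768]
clf_output = sequence_output[:, 0, :]

# Single sigmoid unit -> probability for the binary classification task
out = tf.keras.layers.Dense(1, activation="sigmoid")(clf_output)

model = tf.keras.Model(inputs=[input_word_ids, input_mask, segment_ids], outputs=out)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```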

The first token of the output sequence corresponds to the first token of the input, i.e. [CLS]. The output at the [CLS] position is treated as a representation of the whole input sequence, which is why it is used for classification. You can read the original BERT paper to understand this better.
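To make the indexing concrete, here is a small toy illustration (the shapes and tokens are made up, not taken from the notebook): every BERT input starts with [CLS], so position 0 along the sequence axis is the contextualized vector for that token.

```python
import numpy as np

# Toy stand-in for BERT's sequence_output: batch of 2 examples,
# max_seq_length of 5, hidden size 768 (values chosen for illustration only).
sequence_output = np.random.rand(2, 5, 768)

# Each input sequence begins with [CLS], e.g.:
# ["[CLS]", "forest", "fire", "near", "[SEP]"]
# so index 0 along the sequence axis is the [CLS] position.
cls_output = sequence_output[:, 0, :]
print(cls_output.shape)  # (2, 768) -- one 768-dim [CLS] vector per example
```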
