
Sentence transformer using huggingface/transformers pre-trained model vs SentenceTransformer

This page has two scripts.

When should one use the 1st method shown below vs the 2nd? As nli-distilroberta-base-v2 is trained specifically for finding sentence embeddings, won't it always be better than the first method?

training_stsbenchmark.py -

import sys

from sentence_transformers import SentenceTransformer, LoggingHandler, losses, models, util

# You can specify any huggingface/transformers pre-trained model here, for example, bert-base-uncased, roberta-base, xlm-roberta-base
model_name = sys.argv[1] if len(sys.argv) > 1 else 'distilbert-base-uncased'

# Use Huggingface/transformers model (like BERT, RoBERTa, XLNet, XLM-R) for mapping tokens to embeddings
word_embedding_model = models.Transformer(model_name)

# Apply mean pooling to get one fixed sized sentence vector
pooling_model = models.Pooling(word_embedding_model.get_word_embedding_dimension(),
                               pooling_mode_mean_tokens=True,
                               pooling_mode_cls_token=False,
                               pooling_mode_max_tokens=False)

model = SentenceTransformer(modules=[word_embedding_model, pooling_model])
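
Not part of the original script, but as a quick sanity check the freshly assembled (not yet fine-tuned) model can already encode sentences. A minimal sketch continuing from the snippet above, with made-up example sentences:

# Encode a couple of sentences with the assembled model
embeddings = model.encode(['This framework generates embeddings for each input sentence.',
                           'Sentences are mapped to dense vectors.'])
print(embeddings.shape)  # (2, hidden size of the transformer, e.g. 768 for distilbert-base-uncased)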

training_stsbenchmark_continue_training.py -

from sentence_transformers import SentenceTransformer, LoggingHandler, losses, util, InputExample
model_name = 'nli-distilroberta-base-v2'
model = SentenceTransformer(model_name)

You are comparing 2 different things:

training_stsbenchmark.py - This example shows how to create a SentenceTransformer model from scratch, by combining a pre-trained transformer model with a pooling layer.

In other words, you are creating your own SentenceTransformer model and training it on your own data, i.e. fine-tuning.
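
As an illustration of what that fine-tuning looks like with the classic fit API used in these example scripts, here is a self-contained sketch; the sentence pairs and similarity labels are made up for demonstration only:

from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses, models

# Assemble the model as in training_stsbenchmark.py
word_embedding_model = models.Transformer('distilbert-base-uncased')
pooling_model = models.Pooling(word_embedding_model.get_word_embedding_dimension())
model = SentenceTransformer(modules=[word_embedding_model, pooling_model])

# Made-up STS-style training pairs with similarity labels in [0, 1]
train_examples = [
    InputExample(texts=['A man is eating food.', 'A man is eating a meal.'], label=0.9),
    InputExample(texts=['A man is eating food.', 'A plane is taking off.'], label=0.05),
]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=2)
train_loss = losses.CosineSimilarityLoss(model)

# Fine-tune end to end: the transformer weights are updated together with the pooled sentence embedding
model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=1, warmup_steps=10)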

training_stsbenchmark_continue_training.py - This example shows how to continue training on STS data for a previously created & trained SentenceTransformer model.

In that example, they load a model trained on NLI data.
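
Because nli-distilroberta-base-v2 was already trained to produce sentence embeddings, it gives usable similarities out of the box, even before any further STS training. A minimal sketch with made-up sentences (util.cos_sim is named util.pytorch_cos_sim in older sentence-transformers releases):

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer('nli-distilroberta-base-v2')

# Similarity between two made-up sentences, before any additional STS fine-tuning
emb1 = model.encode('A man is eating food.', convert_to_tensor=True)
emb2 = model.encode('A man is eating a meal.', convert_to_tensor=True)
print(util.cos_sim(emb1, emb2))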

So, to answer "won't that always be better than the first method?":

It depends on your final results. Try both methods and check for yourself which delivers better cross-validation results.
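
One way to run that comparison is sentence-transformers' EmbeddingSimilarityEvaluator, applied to the same held-out pairs for each candidate model. A sketch with made-up validation data (older releases return a single Spearman correlation, newer ones a dict of metrics):

from sentence_transformers import SentenceTransformer, evaluation

# Made-up held-out STS-style pairs with gold similarity scores in [0, 1]
sentences1 = ['A man is eating food.', 'A child is playing outside.']
sentences2 = ['A man is eating a meal.', 'A plane is taking off.']
gold_scores = [0.9, 0.05]

evaluator = evaluation.EmbeddingSimilarityEvaluator(sentences1, sentences2, gold_scores)

# Score each candidate on the same data; replace the names with your own checkpoints
for name in ['nli-distilroberta-base-v2', 'path/to/your-fine-tuned-model']:
    model = SentenceTransformer(name)
    print(name, evaluator(model))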
