I want to train GPT-2 from scratch, but every article I have found only covers fine-tuning on top of the pretrained models. I've used https://github.com/nshepperd/gpt-2 to train starting from an existing model. Should I edit these Python scripts to train from scratch?
I found the answer in the issues of the same repo, https://github.com/nshepperd/gpt-2:

"If you want to not use the released model at all, for instance because you want to train a model with incompatible hyperparameters, it should be sufficient to just skip the restore from the released model checkpoint (around train.py:164-177) on your first run, so the parameters will all be randomly initialized."
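In other words, if the script never calls `saver.restore(...)` on the first run, TensorFlow's variable initializer leaves all weights at their random initial values, which is exactly "training from scratch". A minimal sketch of that decision logic (the `checkpoint_to_restore` helper and the `"scratch"` option are hypothetical, not the repo's actual API):

```python
def checkpoint_to_restore(restore_from, run_checkpoint, released_checkpoint):
    """Pick which checkpoint to restore, or None to skip restoring entirely.

    Returning None means the restore step is skipped, so the model's
    variables keep their random initialization -- training from scratch.
    """
    if restore_from == "scratch":
        return None                    # first run: no restore at all
    if restore_from == "latest" and run_checkpoint is not None:
        return run_checkpoint          # resume an interrupted training run
    return released_checkpoint         # default: fine-tune the released model


# First run from scratch: nothing to restore.
print(checkpoint_to_restore("scratch", None, "models/117M/model.ckpt"))
# Later runs: resume from your own run's checkpoint, never the released one.
print(checkpoint_to_restore("latest", "checkpoint/run1/model-1000", None))
```

In the real `train.py` you would apply the same idea by guarding the `saver.restore(...)` call around lines 164-177, so that with incompatible hyperparameters the released checkpoint is never loaded.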