
How does language model evaluation work with unknown words?

When building language models, words whose frequency rank falls beyond the vocabulary size are replaced with 'UNK'.

My question is: how should we evaluate a language model that assigns probabilities using 'UNK'? Say we want to compute the perplexity of such a model on a test set. For words unknown to the model, the probability we get is that of the whole 'bag' of unknown words, not of the specific word.

This seems problematic: if we set the vocabulary size to 1, i.e. all words are unknown, then this model, which can do nothing useful, assigns probability 1 to 'UNK' at every position, so its perplexity is 1.
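To make the degenerate case concrete, here is a minimal sketch. The function name `perplexity` and the toy sentence are made up for illustration. Perplexity is the exponential of the average negative log-probability per token, so a model that maps every word to 'UNK' and gives 'UNK' probability 1 scores exactly 1:

```python
import math

def perplexity(token_probs):
    """Perplexity = exp of the mean negative log-probability per token."""
    return math.exp(-sum(math.log(p) for p in token_probs) / len(token_probs))

# Hypothetical test sentence; with a vocabulary of size 1 every token becomes 'UNK'.
test_tokens = "the cat sat on the mat".split()
unk_probs = [1.0 for _ in test_tokens]  # P('UNK') = 1 at every position

print(perplexity(unk_probs))  # prints 1.0
```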

This document explains the issue very well:

https://web.stanford.edu/~jurafsky/slp3/4.pdf

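In short, perplexity should only be compared between language models that use the same vocabulary, since a model can reach a low perplexity simply by choosing a small vocabulary and giving 'UNK' a high probability.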
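As a rough illustration of why the vocabulary has to be held fixed, the sketch below trains an add-one-smoothed unigram model on a toy corpus and reports test perplexity for several vocabulary sizes. Everything here (the corpora, the helper names, the choice of a unigram model with add-one smoothing) is an assumption made for the example, not something taken from the linked document.

```python
import math
from collections import Counter

UNK = "UNK"

def map_unk(tokens, vocab):
    """Replace every out-of-vocabulary token with UNK."""
    return [t if t in vocab else UNK for t in tokens]

def unigram_perplexity(train_tokens, test_tokens, num_known):
    """Test perplexity of an add-one-smoothed unigram model that keeps
    only the num_known most frequent training words (plus UNK)."""
    ranked = Counter(train_tokens).most_common(num_known)
    vocab = {word for word, _ in ranked}
    train = map_unk(train_tokens, vocab)
    test = map_unk(test_tokens, vocab)

    counts = Counter(train)
    num_types = len(vocab) + 1            # known words plus UNK
    denom = len(train) + num_types        # add-one smoothing denominator
    log_prob = sum(math.log((counts[t] + 1) / denom) for t in test)
    return math.exp(-log_prob / len(test))

# Toy corpora, invented for this example.
train_tokens = "the cat sat on the mat the dog sat on the log".split()
test_tokens = "the cat sat on the log".split()

for num_known in (0, 2, 4, 7):
    ppl = unigram_perplexity(train_tokens, test_tokens, num_known)
    print(f"known words = {num_known}: perplexity = {ppl:.3f}")
```

Only the `num_known = 0` case is guaranteed to come out as exactly 1; the other values depend on the toy data. The point is that each setting scores a different set of events, so the numbers are not comparable with one another.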
