
Clarify steps to add a language variant to Stanza

Original 2022-07-25 18:54:07 0 1 stanford-nlp

I would like to add a non-standard variant of a language that Stanza already supports. It should have a different name from the standard variety shipped in the common Stanza distribution. I could train on a modified version of the existing corpus, since the changes are mostly morphological rather than syntactic, but what steps would I need to take to build a new language variety for Stanza from this starting point? From the web documentation on adding a new language, I can't tell which data are inputs and which are outputs.

1 answer

It sounds like you are trying to add a different set of processors rather than a whole new language. The difference is that the other steps of the pipeline, such as the NER models, will still work the same, right?
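
As a rough sketch of what "a different set of processors" could look like once you have retrained models: Stanza's `Pipeline` accepts per-processor model paths such as `pos_model_path` and `lemma_model_path`, so the language code stays the same and only the swapped-in models change. The helper names and the file paths below are hypothetical, and `ner` should only be listed if the language actually has an NER model.

```python
def variant_pipeline_kwargs(lang, pos_model_path, lemma_model_path,
                            processors="tokenize,pos,lemma,ner"):
    """Build keyword arguments for a pipeline that swaps in retrained models.

    The language code is unchanged; only the POS and lemma models are replaced.
    Processors without an explicit path use the standard downloaded models.
    """
    return {
        "lang": lang,
        "processors": processors,
        "pos_model_path": pos_model_path,      # hypothetical path to your retrained tagger
        "lemma_model_path": lemma_model_path,  # hypothetical path to your retrained lemmatizer
    }


def load_variant_pipeline(lang, pos_model_path, lemma_model_path):
    """Create the pipeline; stanza is imported here so the helper above has no dependencies."""
    import stanza
    return stanza.Pipeline(**variant_pipeline_kwargs(lang, pos_model_path, lemma_model_path))
```

Usage would be something like `nlp = load_variant_pipeline("en", "saved_models/pos/my_variant_tagger.pt", "saved_models/lemma/my_variant_lemmatizer.pt")`, with both paths replaced by wherever your training run wrote its models.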

If that's the case, and you can follow the steps to retrain the current models, you should then be able to replace the input data with your morphologically updated version.
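
To make "replace the input data" a bit more concrete: the training data for these models is a treebank in CoNLL-U format, one token per line with ten tab-separated columns, so a morphological variant can be produced by rewriting the FORM column. This is only a minimal sketch under that assumption. The function name and the `form_map` dictionary are made up for illustration, and comment lines such as `# text = ...` are passed through unchanged, so they may need to be updated separately.

```python
def rewrite_conllu(lines, form_map):
    """Yield CoNLL-U lines with the FORM column (column 2) replaced via form_map.

    Comment lines (starting with '#') and blank sentence separators are
    passed through unchanged; all other columns are left as they are.
    """
    for line in lines:
        stripped = line.rstrip("\n")
        if not stripped or stripped.startswith("#"):
            yield stripped
            continue
        cols = stripped.split("\t")
        if len(cols) == 10:
            # Swap the surface form if the variant spells it differently.
            cols[1] = form_map.get(cols[1], cols[1])
        yield "\t".join(cols)


if __name__ == "__main__":
    sample = [
        "# text = the colour",
        "1\tthe\tthe\tDET\t_\t_\t2\tdet\t_\t_",
        "2\tcolour\tcolour\tNOUN\t_\t_\t0\troot\t_\t_",
        "",
    ]
    for out_line in rewrite_conllu(sample, {"colour": "color"}):
        print(out_line)
```

The same loop could be extended to touch the LEMMA or FEATS columns if your variant changes those as well. The rewritten files then take the place of the original treebank files in the retraining steps.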

I suggest filing an issue on GitHub if you encounter difficulties in the process. It will be a lot easier to go back and forth there.

We would actually recommend a whole new language when 1) it's actually a new language, or 2) it uses a different character set: think of the different writing systems for ZH or for Punjabi, if we had any Punjabi models.




solution1 0 2022-07-26 18:46:59
