
spaCy: Can't find model 'it'

Can you please tell me what I am missing in the code below? I am trying to use some functions (defined at the bottom of the post) that help me remove stopwords, form bigrams, and do lemmatisation. The language is Italian, and I am using spaCy for this.

!python -m spacy download it_core_news_sm

import spacy
nlp = spacy.load("it_core_news_sm")

data_words_nostops = remove_stopwords(tok_text_list)

# Form Bigrams
data_words_bigrams = make_bigrams(data_words_nostops)

nlp = spacy.load('it', disable=['parser', 'ner'])

# Do lemmatization keeping only noun, adj, vb, adv
data_lemmatized = lemmatization(data_words_bigrams, allowed_postags=['NOUN', 'ADJ', 'VERB', 'ADV'])

print(data_lemmatized[:1])

where

tok_text_list= [['papa',
  ',',
  "l'aspirante",
  'pilota',
  'anni',
  'morto',
  'fiume',
  'tevere',
  'seguito',
  "all'incidente",
  "l'aereo",
  '.',
  'spiaggia',
  'campo',
  'mare',
  'é',
  'vietata',
  'disabili',
  '.'], [...]]

The error that I am getting is:

OSError                                   Traceback (most recent call last)
<ipython-input-216-775b3f412d6f> in <module>

---> 14 nlp = spacy.load('it', disable=['parser', 'ner'])
     15 
     16 # Do lemmatization keeping only noun, adj, vb, adv

    OSError: [E050] Can't find model 'it'. It doesn't seem to be a shortcut link, a Python package or a valid path to a data directory.

Maybe I forgot to include something in the code or to download some other file. I also tried rerunning everything as suggested here: Loading the spacy german language model into a jupyter notebook. I am using Jupyter Notebook.

Thanks

!python -m spacy download it

Maybe just download the 'it' shortcut too? You downloaded it_core_news_sm, but the line that fails loads 'it'.
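
Another option is to skip the shortcut and load the model by the full package name that was already downloaded. This also keeps working on spaCy v3, where shortcut links such as 'it' were removed. Below is a minimal sketch, not a definitive fix. The helper names (LEGACY_SHORTCUTS, resolve_model_name, download_command, load_model) are hypothetical and not part of spaCy, and the mapping assumes 'it' is meant to point at it_core_news_sm. Building the download command from sys.executable is meant to ensure the model is installed into the same interpreter the notebook kernel runs, which a plain !python call does not guarantee.

```python
import sys

# Hypothetical mapping from a legacy shortcut to the full package name
# (assumption: 'it' should resolve to the small Italian model).
LEGACY_SHORTCUTS = {"it": "it_core_news_sm"}


def resolve_model_name(name):
    """Return the full package name for a known shortcut, else the name unchanged."""
    return LEGACY_SHORTCUTS.get(name, name)


def download_command(name):
    """Build a download command that targets the interpreter running this code."""
    return [sys.executable, "-m", "spacy", "download", resolve_model_name(name)]


def load_model(name, disable=("parser", "ner")):
    """Load a model by its full package name; spaCy is imported only when called."""
    import spacy

    return spacy.load(resolve_model_name(name), disable=list(disable))
```

In a notebook, you could run subprocess.run(download_command("it"), check=True) once (after import subprocess), restart the kernel if the load still fails, and then call nlp = load_model("it") in place of the spacy.load('it', ...) line.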

