
tidytext in R for Spanish: any alternative?

I'm doing sentiment analysis on Twitter data, but my tweets are in Spanish, so I can't use tidytext to classify the words. Does anyone know of a similar package for Spanish?

I ran into the same issue with non-English text mining. I found udpipe, an R package developed by BNOSAC. It is a natural language processing toolkit that provides language-agnostic tokenization, part-of-speech tagging, lemmatization, morphological feature tagging, and dependency parsing of raw text. Be aware that the package does not include sentiment tags; you will need to find those elsewhere.

It supports a diverse range of non-English languages.
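
As a rough illustration of the udpipe workflow described above, here is a minimal sketch. It assumes the udpipe package is installed, that you have internet access to download a model, and that the "spanish-gsd" treebank model suits your data; the example tweets and document ids are made up.

```r
library(udpipe)

# Download a pre-trained Spanish model (needs internet access on first run).
# "spanish-gsd" is one of the Spanish treebank models; treat the choice as an assumption.
model_info <- udpipe_download_model(language = "spanish-gsd")
model <- udpipe_load_model(file = model_info$file_model)

# Hypothetical tweets; replace with your own UTF-8 encoded text.
tweets <- c("Me encanta este día", "El servicio fue horrible")

# Tokenize, tag, lemmatize, and parse, then convert the result to a data frame.
annotated <- udpipe_annotate(model, x = tweets, doc_id = c("t1", "t2"))
annotated <- as.data.frame(annotated)

# One row per token, with its lemma and universal part-of-speech tag.
annotated[, c("doc_id", "token", "lemma", "upos")]
```

The lemma column is usually the useful one for a later lexicon lookup, since Spanish words inflect heavily and most lexicons list base forms.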

You can find out more on their blog, on the udpipe webpage, or on GitHub.

PS I have no affiliation with them.

There are unfortunately not many good open-source sentiment lexicons for non-English languages right now. You can request the NRC lexicon in other languages from its authors. It is translated with Google Translate, which of course adds uncertainty but has been shown to be mostly OK overall. The authors say they give it away for research purposes but charge for commercial use.
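
Once you have a Spanish lexicon as a plain two-column table, the usual tidytext join should still work, since the tokenizer itself is not tied to English. Here is a minimal sketch: the tweets and the four-word `lexicon_es` table are hypothetical stand-ins for your own data and for a real lexicon such as a translated NRC.

```r
library(dplyr)
library(tidytext)

# Hypothetical tweets; replace with your own data.
tweets <- tibble(
  id   = 1:2,
  text = c("Me encanta este día, estoy muy feliz",
           "Qué servicio tan horrible, estoy triste")
)

# Hypothetical toy lexicon standing in for a real Spanish sentiment lexicon.
lexicon_es <- tibble(
  word      = c("encanta", "feliz", "horrible", "triste"),
  sentiment = c("positive", "positive", "negative", "negative")
)

scores <- tweets %>%
  unnest_tokens(word, text) %>%            # lowercase and split into one word per row
  inner_join(lexicon_es, by = "word") %>%  # keep only words found in the lexicon
  count(id, sentiment)                     # matched words per tweet and sentiment

scores
```

Matching is on exact word forms here, so inflected words (for example "felices") would be missed unless you lemmatize first or the lexicon lists them.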

The Stanford CoreNLP package is on CRAN, and it also provides sentiment for Spanish through the get_sentiment function.
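
As far as I know, a function named `get_sentiment` is provided by the syuzhet package on CRAN, so the sketch below assumes that is the function meant here. It also assumes syuzhet is installed and that its bundled NRC lexicon covers Spanish through the `language` argument; the tweets are made up.

```r
library(syuzhet)

# Hypothetical tweets; replace with your own data.
tweets <- c("Me encanta este día, estoy muy feliz",
            "Qué servicio tan horrible, estoy triste")

# The "nrc" method accepts a language argument; other methods are English-only.
scores <- get_sentiment(tweets, method = "nrc", language = "spanish")

# One numeric score per tweet: positive values lean positive, negative values lean negative.
scores
```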

