
Problem with Analysing Turkish Text while using stopwords “tr” with R

I am analysing Turkish text in R, but I run into a problem when calling stopwords("tr"). According to the linked page, Turkish is represented by "tr", yet the function still does not recognize it.

Here is the error:

Error: Language "tr" not available in source "snowball". See stopwords_getlanguages for more information on supported languages.

Any help would be appreciated.

You are almost there. You just need to change the source that stopwords::stopwords() reads the language from.

TL;DR:

To run your code, you need:

stopwords::stopwords("tr", source = "stopwords-iso")
[1] "acaba"      "acep"       "adamakıllı" "adeta"      "ait"        "altmýþ"  ... 

Explanation:

These are the languages available in the default source = "snowball"

stopwords::stopwords_getlanguages(source = "snowball")
[1] "da" "de" "en" "es" "fi" "fr" "hu" "ir" "it" "nl" "no" "pt" "ro" "ru" "sv"

To get Turkish, you just need to change the source to source = "stopwords-iso". Below you can see all the languages available in this source.

stopwords::stopwords_getlanguages(source = "stopwords-iso")
 [1] "af" "ar" "hy" "eu" "bn" "br" "bg" "ca" "zh" "hr" "cs" "da" "nl" "en" "eo" "et" "fi" "fr" "gl" "de" "el" "ha" "he" "hi" "hu" "id" "ga"
[28] "it" "ja" "ko" "ku" "la" "lt" "lv" "ms" "mr" "no" "fa" "pl" "pt" "ro" "ru" "sk" "sl" "so" "st" "es" "sw" "sv" "th" "tl" "tr" "uk" "ur"
[55] "vi" "yo" "zu"
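If you would rather not compare these lists by eye, the same check can be scripted. The following is only a minimal sketch: find_stopword_sources() is a hypothetical helper, not part of the stopwords package, and it only looks at the two sources discussed in this answer unless you pass others.

```r
# Hypothetical helper (not part of the stopwords package): return the sources
# whose language list contains `lang`.
# Assumption: only the two sources discussed here are checked by default.
find_stopword_sources <- function(lang,
                                  sources = c("snowball", "stopwords-iso"),
                                  get_languages = stopwords::stopwords_getlanguages) {
  # For each source, test whether `lang` appears among its language codes
  has_lang <- vapply(
    sources,
    function(src) lang %in% get_languages(src),
    logical(1)
  )
  sources[has_lang]
}

# Only query the real package if it is installed
if (requireNamespace("stopwords", quietly = TRUE)) {
  print(find_stopword_sources("tr"))
}
```

The first element of the result can then be passed as the source argument of stopwords::stopwords().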

So, to run your code, you need:

stopwords::stopwords("tr", source = "stopwords-iso")
[1] "acaba"      "acep"       "adamakıllı" "adeta"      "ait"        "altmýþ"  ... 
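Once you have the vector, removing the stop words from a piece of text could look roughly like this. It is a base R sketch rather than a full text-mining pipeline: remove_stopwords() is a hypothetical helper, the sample sentence is made up, and the input is assumed to be lower-case already (case folding for Turkish, e.g. I/ı and İ/i, needs locale-aware handling that is not shown here).

```r
# Hypothetical helper: drop every token of `text` that appears in `stops`.
# Assumption: `text` is a single, already lower-cased string.
remove_stopwords <- function(text, stops) {
  # Split on runs of whitespace
  tokens <- strsplit(text, "[[:space:]]+")[[1]]
  # Strip punctuation attached to the start or end of a token
  tokens <- gsub("^[[:punct:]]+|[[:punct:]]+$", "", tokens)
  # Discard empty strings left over from splitting or stripping
  tokens <- tokens[nzchar(tokens)]
  # Keep only the tokens that are not stop words
  tokens[!tokens %in% stops]
}

# Only query the real package if it is installed
if (requireNamespace("stopwords", quietly = TRUE)) {
  tr_stops <- stopwords::stopwords("tr", source = "stopwords-iso")
  print(remove_stopwords("acaba bu kitap adeta bir hazine", tr_stops))
}
```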


 