[英]R sentiment analysis; 'lexicon' not found; 'sentiments' corrupted?
我试图按照此对情感分析的在线教程。 编码:
new_sentiments <- sentiments %>% #From the tidytext package
filter(lexicon != "loughran") %>% #Remove the finance lexicon
mutate( sentiment = ifelse(lexicon == "AFINN" & score >= 0, "positive",
ifelse(lexicon == "AFINN" & score < 0,
"negative", sentiment))) %>%
group_by(lexicon) %>%
mutate(words_in_lexicon = n_distinct(word)) %>%
ungroup()
产生错误:
>Error in filter_impl(.data, quo) :
>Evaluation error: object 'lexicon' not found.
相关的,也许是对我来说,“情绪”表的行为很奇怪(损坏了?)。 这是“情绪”的头部:
> head(sentiments,3)
> element_id sentence_id word_count sentiment
> chapter
> 1 1 1 7 0 The First Book of Moses:
> Called Genesis
> 2 2 1 NA 0 The First Book of Moses:
> Called Genesis
> 3 3 1 NA 0 The First Book of Moses: >
> Called Genesis
> category
> 1 The First Book of Moses: Called Genesis
> 2 The First Book of Moses: Called Genesis
> 3 The First Book of Moses: Called Genesis
但是,如果我对 bing、AFINN 或 NRC 使用 Get_Sentiments,我会得到看起来像适当的响应:
> get_sentiments("bing")
> # A tibble: 6,788 x 2
> word sentiment
> <chr> <chr> > 1 2-faced negative
> 2 2-faces negative
> 3 a+ positive
> 4 abnormal negative
我尝试删除 (remove.packages) 并重新安装 tidytext; 行为没有变化。 我正在运行 R 3.5
即使我完全误解了这个问题,我也很感激任何人能给我的任何见解。
以下说明将修复数据营教程中所示的new_sentiments
数据集。
bing <- get_sentiments("bing") %>%
mutate(lexicon = "bing",
words_in_lexicon = n_distinct(word))
nrc <- get_sentiments("nrc") %>%
mutate(lexicon = "nrc",
words_in_lexicon = n_distinct(word))
afinn <- get_sentiments("afinn") %>%
mutate(lexicon = "afinn",
words_in_lexicon = n_distinct(word))
new_sentiments <- bind_rows(bing, nrc, afinn)
names(new_sentiments)[names(new_sentiments) == 'value'] <- 'score'
new_sentiments %>%
group_by(lexicon, sentiment, words_in_lexicon) %>%
summarise(distinct_words = n_distinct(word)) %>%
ungroup() %>%
spread(sentiment, distinct_words) %>%
mutate(lexicon = color_tile("lightblue", "lightblue")(lexicon),
words_in_lexicon = color_bar("lightpink")(words_in_lexicon)) %>%
my_kable_styling(caption = "Word Counts per Lexicon")
随后的图表也将起作用!
看来tidytext
必须更改,这破坏了教程中的一些代码。
要使代码运行,请替换
new_sentiments <- sentiments %>% #From the tidytext package
filter(lexicon != "loughran") %>% #Remove the finance lexicon
mutate( sentiment = ifelse(lexicon == "AFINN" & score >= 0, "positive",
ifelse(lexicon == "AFINN" & score < 0,
"negative", sentiment))) %>%
group_by(lexicon) %>%
mutate(words_in_lexicon = n_distinct(word)) %>%
ungroup()
和
new_sentiments <- get_sentiments("afinn")
names(new_sentiments)[names(new_sentiments) == 'value'] <- 'score'
new_sentiments <- new_sentiments %>% mutate(lexicon = "afinn", sentiment = ifelse(score >= 0, "positive", "negative"),
words_in_lexicon = n_distinct((word)))
接下来的几张图没有多大意义(因为我们现在只使用一个词典),但本教程的其余部分将起作用
更新这里是tidytext
包作者对发生的事情的一个很好的解释。
我发现了一个类似的问题,我在下面尝试了这段代码,希望它会有所帮助
library(tm)
library(tidyr)
library(ggthemes)
library(ggplot2)
library(dplyr)
library(tidytext)
library(textdata)
# Choose the bing lexicon
get_sentiments("bing")
get_sentiments("afinn")
get_sentiments("nrc")
#define new
afinn=get_sentiments("afinn")
bing=get_sentiments("bing")
nrc=get_sentiments("nrc")
#check
head(afinn)
head(bing)
head(nrc)
head(sentiments) #from tidytext packages
#merging dataframe
merge_sentiments=rbind(sentiments,get_sentiments('bing'),get_sentiments('nrc'))
head(merge_sentiments) #check
merge2_sentiments=merge(merge_sentiments,afinn,by=1,all=T)
head(merge2_sentiments) #check
#make new data frame with column lexicon added
new_sentiments <- merge2_sentiments
new_sentiments <- new_sentiments %>%
mutate(lexicon=ifelse(sentiment=='positive','bing',ifelse(sentiment=='negative','bing',ifelse(sentiment=='NA','afinn','nrc'))))
colnames(new_sentiments)[colnames(new_sentiments)=='value']='score'
#check
head(new_sentiments)
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.