R sentiment analysis; "lexicon" not found; "sentiments" corrupted?

Question

I am trying to follow this online tutorial on sentiment analysis. The code:

new_sentiments <- sentiments %>% #From the tidytext package
  filter(lexicon != "loughran") %>% #Remove the finance lexicon
  mutate( sentiment = ifelse(lexicon == "AFINN" & score >= 0, "positive",
                         ifelse(lexicon == "AFINN" & score < 0,
                                "negative", sentiment))) %>%
  group_by(lexicon) %>%
  mutate(words_in_lexicon = n_distinct(word)) %>%
  ungroup()

produces the error:

>Error in filter_impl(.data, quo) : 
>Evaluation error: object 'lexicon' not found.

Possibly related: the "sentiments" table behaves strangely for me (corrupted?). Here is the head of "sentiments":

> head(sentiments,3)
>   element_id sentence_id word_count sentiment                                  chapter
> 1          1           1          7         0 The First Book of Moses:  Called Genesis
> 2          2           1         NA         0 The First Book of Moses:  Called Genesis
> 3          3           1         NA         0 The First Book of Moses:  Called Genesis
>                                   category
> 1 The First Book of Moses:  Called Genesis
> 2 The First Book of Moses:  Called Genesis
> 3 The First Book of Moses:  Called Genesis

However, if I call get_sentiments for bing, AFINN, or NRC, I get what looks like an appropriate response:

> get_sentiments("bing")
> # A tibble: 6,788 x 2
>   word        sentiment
>   <chr>       <chr>
> 1 2-faced     negative
> 2 2-faces     negative
> 3 a+          positive
> 4 abnormal    negative

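I tried removing (remove.packages) and reinstalling tidytext; the behavior did not change. I am running R 3.5.

Even if I have completely misunderstood the problem, I would appreciate any insight anyone can give me.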

Answer 1

The following steps rebuild the new_sentiments dataset as shown in the DataCamp tutorial.

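library(dplyr)       # %>%, mutate(), bind_rows(), n_distinct()
library(tidyr)       # spread()
library(tidytext)    # get_sentiments()
library(formattable) # color_tile(), color_bar()

# Tag each lexicon with its name and its number of distinct words
bing <- get_sentiments("bing") %>%
  mutate(lexicon = "bing",
         words_in_lexicon = n_distinct(word))

nrc <- get_sentiments("nrc") %>%
  mutate(lexicon = "nrc",
         words_in_lexicon = n_distinct(word))

afinn <- get_sentiments("afinn") %>%
  mutate(lexicon = "afinn",
         words_in_lexicon = n_distinct(word))

# Stack the three lexicons; AFINN's numeric column is called `value`,
# so rename it to `score` as the tutorial code expects
new_sentiments <- bind_rows(bing, nrc, afinn)
names(new_sentiments)[names(new_sentiments) == "value"] <- "score"

# my_kable_styling() is assumed to be the helper defined earlier in the tutorial
new_sentiments %>%
  group_by(lexicon, sentiment, words_in_lexicon) %>%
  summarise(distinct_words = n_distinct(word)) %>%
  ungroup() %>%
  spread(sentiment, distinct_words) %>%
  mutate(lexicon = color_tile("lightblue", "lightblue")(lexicon),
         words_in_lexicon = color_bar("lightpink")(words_in_lexicon)) %>%
  my_kable_styling(caption = "Word Counts per Lexicon")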

The subsequent charts will work as well!
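One thing to watch with a combined table like this: the AFINN rows carry a numeric score but no sentiment label, so they end up under NA when grouped by sentiment. Below is a minimal sketch of applying the sign-based labelling from the question's code while stacking lexicons. The two small tables (bing_toy, afinn_toy) are made-up stand-ins for get_sentiments("bing") and get_sentiments("afinn") so the snippet runs without downloading anything; their words and values are illustrative only.

```r
library(dplyr)

# Hypothetical stand-ins for the real lexicons (illustrative rows only)
bing_toy  <- tibble(word = c("abnormal", "a+"),
                    sentiment = c("negative", "positive"))
afinn_toy <- tibble(word = c("abandon", "amazing"),
                    value = c(-2, 4))

new_sentiments_toy <- bind_rows(
  # bing already has a sentiment label; just tag the source
  bing_toy %>% mutate(lexicon = "bing"),
  # afinn only has a number: rename it and derive a label from its sign
  afinn_toy %>%
    rename(score = value) %>%
    mutate(lexicon = "afinn",
           sentiment = ifelse(score >= 0, "positive", "negative"))
) %>%
  group_by(lexicon) %>%
  mutate(words_in_lexicon = n_distinct(word)) %>%
  ungroup()

print(new_sentiments_toy)
```

With the real lexicons, the same pattern would replace the toy tables with the get_sentiments() results; the bing rows simply keep NA in the score column.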

Answer 2

It looks like tidytext was changed, which broke some of the code in the tutorial.

To make the code run, replace

new_sentiments <- sentiments %>% #From the tidytext package
  filter(lexicon != "loughran") %>% #Remove the finance lexicon
  mutate( sentiment = ifelse(lexicon == "AFINN" & score >= 0, "positive",
                              ifelse(lexicon == "AFINN" & score < 0,
                                     "negative", sentiment))) %>%
  group_by(lexicon) %>%
  mutate(words_in_lexicon = n_distinct(word)) %>%
  ungroup()

with

new_sentiments <- get_sentiments("afinn")
names(new_sentiments)[names(new_sentiments) == "value"] <- "score"
new_sentiments <- new_sentiments %>%
  mutate(lexicon = "afinn",
         sentiment = ifelse(score >= 0, "positive", "negative"),
         words_in_lexicon = n_distinct(word))

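The next few charts will not make much sense (because we are now using only one lexicon), but the rest of the tutorial will work.

Update: here is a good explanation from the tidytext package author of what happened.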
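As for the odd head(sentiments) output in the question (columns such as element_id and sentence_id), one possible cause is that another object named sentiments in the workspace is shadowing the dataset shipped with tidytext. That is only a guess from the column names, but it is cheap to check. A minimal sketch, assuming tidytext is installed:

```r
library(tidytext)

# Column names of the dataset that ships with the installed tidytext version
pkg_cols <- names(tidytext::sentiments)
print(pkg_cols)

# TRUE if an object called `sentiments` in the global environment
# is shadowing the package dataset
shadowed <- exists("sentiments", envir = globalenv(), inherits = FALSE)
print(shadowed)

# Every place on the search path where `sentiments` is found
print(find("sentiments"))

# If a workspace object is in the way, remove it (or use tidytext::sentiments):
# rm(sentiments)
```

If pkg_cols has no lexicon column, the installed tidytext is one where the tutorial's filter(lexicon != "loughran") line cannot work as written.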

Answer 3

I ran into a similar problem and tried the code below; I hope it helps.

library(tm)
library(tidyr)
library(ggthemes)
library(ggplot2)
library(dplyr)
library(tidytext)
library(textdata)

# Print each lexicon
get_sentiments("bing")
get_sentiments("afinn")
get_sentiments("nrc")

# Store each lexicon
afinn <- get_sentiments("afinn")
bing <- get_sentiments("bing")
nrc <- get_sentiments("nrc")

# Check
head(afinn)
head(bing)
head(nrc)
head(sentiments) # from the tidytext package

# Stack bing and nrc, tagging each row with its source lexicon.
# Inferring the lexicon from the sentiment label afterwards would mislabel
# nrc's own "positive"/"negative" rows as bing. `sentiments` is left out:
# assuming it now holds only the bing lexicon, it would just duplicate rows.
merge_sentiments <- rbind(mutate(bing, lexicon = "bing"),
                          mutate(nrc, lexicon = "nrc"))
head(merge_sentiments) # check

# Full outer join with afinn on the word column
merge2_sentiments <- merge(merge_sentiments, afinn, by = "word", all = TRUE)
head(merge2_sentiments) # check

# Words that appear only in afinn have no lexicon tag yet
new_sentiments <- merge2_sentiments %>%
  mutate(lexicon = ifelse(is.na(lexicon), "afinn", lexicon))

colnames(new_sentiments)[colnames(new_sentiments) == "value"] <- "score"

# Check
head(new_sentiments)

3 answers

Answer 1
2 2019-10-23 04:02:09

Answer 2
1 2019-09-09 03:34:46

Answer 3
0 2020-04-25 22:25:04
