Why does NRC sentiment come out positive when the reviews are actually negative? (R)

Question

I am running sentiment analysis on reviews from the following page.

https://www.yelp.com/biz/24th-st-pizzeria-san-antonio?osq=Worst+Restaurant

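The customers are clearly unhappy, yet the sentiment I get keeps coming out positive. Am I missing a data preprocessing step, or is something wrong with my code? How can I get a more accurate read on the sentiment? I looked for other packages that can do this, but syuzhet seems to be the only one with this capability.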

library(syuzhet)
library(plotly)

df=read.csv('https://raw.githubusercontent.com/bandcar/Examples/main/reviews.csv')

sentReviews <- iconv(df$Text)

# get user emotions using the NRC dictionary
nrc_emotions <- get_nrc_sentiment(sentReviews)
head(nrc_emotions)

# Build a df for emotions using column sums
emo_bar = colSums(nrc_emotions)
emo_sum = data.frame(count=emo_bar, emotion=names(emo_bar))
emo_sum

# Prepare to graph by ordering from highest to lowest
emo_sum$emotion = factor(emo_sum$emotion, levels=emo_sum$emotion[order(emo_sum$count, decreasing = TRUE)])


plot_ly(emo_sum, x=~emotion, y=~count, type="bar", color=~emotion) %>%
  layout(xaxis=list(title=""), showlegend=FALSE,
         title="Distribution of emotion categories")

Answer 1

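Several packages can do sentiment analysis, for example sentimentr. If you use the NRC dictionary as a lookup table, you can also use other packages such as tidytext and quanteda (or even tm).

With syuzhet and get_nrc_sentiment you do not get a sentiment score. You get a table that tells you, for each text, how often words from each NRC category were found. The last two columns give the number of negative and positive words.

The first six rows for the reviews show: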

head(nrc_emotions)
  anger anticipation disgust fear joy sadness surprise trust negative positive
1     0            0       0    0   1       0        0     1        1        1
2     2            2       1    1   1       0        1     2        4        3
3     4            2       4    3   2       1        2     2        6        4
4     2            1       2    1   3       2        1     3        4        3
5     1            2       1    1   1       1        2     1        1        3
6     1            0       2    2   1       2        1     1        5        3
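Because these are word counts rather than scores, summing them over all reviews (as the bar chart in the question does) is not the same as asking how many reviews lean negative. The following is a minimal sketch of that difference. It uses only the negative and positive columns of the six rows printed above, copied by hand so that it runs without syuzhet; the names nrc_head and more_negative are made up for this illustration.

```r
# The six rows printed above, copied by hand (negative/positive columns only)
nrc_head <- data.frame(
  negative = c(1, 4, 6, 4, 1, 5),
  positive = c(1, 3, 4, 3, 3, 3)
)

# Totals across reviews: this is what a column-sum bar chart displays
totals <- colSums(nrc_head)
totals

# Per-review comparison: which reviews have more negative than positive words?
more_negative <- nrc_head$negative > nrc_head$positive
table(more_negative)

# Share of reviews with a negative word majority
mean(more_negative)
```

On the full data, the same comparison can be run on nrc_emotions directly instead of the hand-copied rows.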

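Look at the second review: it has 4 negative words and 3 positive words in total. You can turn these counts into a sentiment in several ways. The simplest is to subtract the negative count from the positive count. Anything below 0 is a negative review, and the lower the number, the more negative the review. Alternatively, divide that difference by the sum of the positive and negative counts to get a score between -1 and 1. The closer the score is to -1, the more negative the review.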

library(dplyr)

nrc_emotions %>% 
  mutate(sentiment1 = positive - negative, 
         sentiment2 = (positive - negative) / (positive + negative)) %>% 
  head()

  anger anticipation disgust fear joy sadness surprise trust negative positive sentiment1 sentiment2
1     0            0       0    0   1       0        0     1        1        1          0  0.0000000
2     2            2       1    1   1       0        1     2        4        3         -1 -0.1428571
3     4            2       4    3   2       1        2     2        6        4         -2 -0.2000000
4     2            1       2    1   3       2        1     3        4        3         -1 -0.1428571
5     1            2       1    1   1       1        2     1        1        3          2  0.5000000
6     1            0       2    2   1       2        1     1        5        3         -2 -0.2500000
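One edge case worth noting: a review with no positive and no negative matches makes the normalized score 0 / 0, which R evaluates to NaN. A small helper can return NA for those reviews instead. This is only a sketch; nrc_score is a hypothetical name, not a function from syuzhet or dplyr.

```r
# Hypothetical helper: normalized score in [-1, 1], NA when nothing matched
nrc_score <- function(positive, negative) {
  total <- positive + negative
  net <- positive - negative
  # Avoid NaN from 0 / 0 for reviews without any positive or negative words
  ifelse(total == 0, NA_real_, net / total)
}

# First two reviews from the table above, plus an invented review with no matches
scores <- nrc_score(positive = c(1, 3, 0), negative = c(1, 4, 0))
scores
```

Inside the dplyr pipeline above, the same helper could be used as mutate(sentiment2 = nrc_score(positive, negative)).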

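Of course, simply using head(get_sentiment(sentReviews, method = "nrc")) from the syuzhet package returns the same scores as positive - negative.

But be sure to compare several sentiment scores obtained with different methods, via syuzhet or even sentimentr or other packages such as sentiment.ai and tardis. They will all give slightly different scores depending on the scoring function used and on whether valence shifters are taken into account (negators such as "not", amplifiers such as "very", and so on). Also check whether the sentiment analysis method suits the job: some are poorly suited to tweets, while others handle them well.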
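To see why valence shifters matter, here is a toy sketch in base R. The four-word lexicon, the negator list, and the functions score_plain and score_negated are all invented for this illustration. They are not the NRC dictionary and not how syuzhet or sentimentr work internally; real packages use far richer rules.

```r
# Hypothetical toy lexicon and negator list, NOT the NRC dictionary
lexicon <- c(good = 1, great = 1, bad = -1, terrible = -1)
negators <- c("not", "never", "no")

# Split a text into lowercase words
tokenize <- function(text) strsplit(tolower(text), "[^a-z']+")[[1]]

# Plain lookup: add up the polarity of every word found in the lexicon
score_plain <- function(text) {
  words <- tokenize(text)
  sum(lexicon[words], na.rm = TRUE)
}

# Same lookup, but flip the polarity of a word that directly follows a negator
score_negated <- function(text) {
  words <- tokenize(text)
  if (length(words) == 0) return(0)
  values <- unname(lexicon[words])
  values[is.na(values)] <- 0
  after_negator <- c(FALSE, head(words, -1) %in% negators)
  sum(ifelse(after_negator, -values, values))
}

review <- "The pizza was not good"
score_plain(review)    # counts "good" as positive
score_negated(review)  # treats "not good" as negative
```

A pure word-count approach behaves like score_plain here, which is one way clearly unhappy reviews can still collect positive counts.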

Solution 1
0 2022-12-24 10:04:19