Why does NRC sentiment show positive emotion when the reviews are actually negative? R

Question

I'm running sentiment analysis on reviews from the following site.

https://www.yelp.com/biz/24th-st-pizzeria-san-antonio?osq=Worst+Restaurant

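The customers are clearly unhappy, yet when I compute the sentiment it keeps coming out positive. Am I missing a preprocessing step, or is something wrong with my code? How can I get a more accurate read on the sentiment? I looked for other packages that can do this, but syuzhet seems to be the only one with this functionality.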

library(syuzhet)
library(plotly)

df=read.csv('https://raw.githubusercontent.com/bandcar/Examples/main/reviews.csv')

sentReviews <- iconv(df$Text)

# get user emotions using the NRC dictionary
nrc_emotions <- get_nrc_sentiment(sentReviews)
head(nrc_emotions)

# Build a df for emotions using column sums
emo_bar = colSums(nrc_emotions)
emo_sum = data.frame(count=emo_bar, emotion=names(emo_bar))
emo_sum

# Prepare to graph by ordering from highest to lowest
emo_sum$emotion = factor(emo_sum$emotion, levels=emo_sum$emotion[order(emo_sum$count, decreasing = TRUE)])


plot_ly(emo_sum, x=~emotion, y=~count, type="bar", color=~emotion) %>%
  layout(xaxis=list(title=""), showlegend=FALSE,
         title="Distribution of emotion categories")

Answer 1

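Several packages can do sentiment analysis, sentimentr for example. If you use the NRC dictionary as a lookup table, you can also use other packages such as tidytext and quanteda (or even tm).

With syuzhet and get_nrc_sentiment you don't get a sentiment score. You get a table that tells you, for each text, how many of its words were found in each NRC category. The last two columns give the number of negative and positive words.

The first six rows for the reviews show: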
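To make the word-counting behaviour concrete, here is a minimal base R sketch of a dictionary lookup. The five-word `toy_lexicon` and the `count_hits()` helper are invented for illustration and are not part of syuzhet or the NRC lexicon; the point is only that a lookup which ignores context will count "not good" as a positive hit.

```r
# Toy lexicon, invented for illustration only; the real NRC lexicon is far larger.
toy_lexicon <- data.frame(
  word      = c("good", "friendly", "happy", "cold", "rude"),
  sentiment = c("positive", "positive", "positive", "negative", "negative"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: count lexicon hits per category, ignoring context entirely.
count_hits <- function(text, lexicon) {
  tokens <- strsplit(tolower(text), "[^a-z']+")[[1]]
  hits <- lexicon$sentiment[match(tokens, lexicon$word)]  # NA for unknown words
  c(negative = sum(hits == "negative", na.rm = TRUE),
    positive = sum(hits == "positive", na.rm = TRUE))
}

review <- "Not good, not friendly, and I was not happy. The pizza was cold."
counts <- count_hits(review, toy_lexicon)
counts
# negative positive
#        1        3
```

The review is plainly negative, yet the count is 3 positive to 1 negative, because "not" is not in the lexicon and negation is never applied.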

head(nrc_emotions)
  anger anticipation disgust fear joy sadness surprise trust negative positive
1     0            0       0    0   1       0        0     1        1        1
2     2            2       1    1   1       0        1     2        4        3
3     4            2       4    3   2       1        2     2        6        4
4     2            1       2    1   3       2        1     3        4        3
5     1            2       1    1   1       1        2     1        1        3
6     1            0       2    2   1       2        1     1        5        3

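If you look at the second review, it has 4 negative words and 3 positive words in total. There are several ways to turn this into a sentiment score. The simplest is to subtract the negative count from the positive count: anything below 0 is a negative review, and the further below 0, the more negative. Alternatively, divide that difference by the sum of the positive and negative counts to get a score between -1 and 1. The closer it is to -1, the more negative the review. For the second review that gives 3 - 4 = -1 and -1 / 7 ≈ -0.14. Note that the ratio is undefined (NaN) for a text with no positive or negative words at all.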

library(dplyr)

nrc_emotions %>% 
  mutate(sentiment1 = positive - negative, 
         sentiment2 = (positive - negative) / (positive + negative)) %>% 
  head()

  anger anticipation disgust fear joy sadness surprise trust negative positive sentiment1 sentiment2
1     0            0       0    0   1       0        0     1        1        1          0  0.0000000
2     2            2       1    1   1       0        1     2        4        3         -1 -0.1428571
3     4            2       4    3   2       1        2     2        6        4         -2 -0.2000000
4     2            1       2    1   3       2        1     3        4        3         -1 -0.1428571
5     1            2       1    1   1       1        2     1        1        3          2  0.5000000
6     1            0       2    2   1       2        1     1        5        3         -2 -0.2500000
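As a follow-up sketch in base R, the two scores can be turned into labels while guarding against the 0 / 0 case. The counts are copied from the six rows above; the seventh all-zero row, the `label` column, and the choice to score an empty review as 0 are assumptions made for illustration.

```r
# Negative/positive counts from the six rows above, plus one hypothetical
# review in which no NRC positive or negative word was found.
scores <- data.frame(
  negative = c(1, 4, 6, 4, 1, 5, 0),
  positive = c(1, 3, 4, 3, 3, 3, 0)
)

total <- scores$positive + scores$negative
scores$sentiment1 <- scores$positive - scores$negative

# Avoid 0 / 0 = NaN: treat a review with no matched words as neutral (assumption).
scores$sentiment2 <- ifelse(total == 0, 0, scores$sentiment1 / total)

# Map the sign of the difference (-1, 0, 1) to a label.
scores$label <- c("negative", "neutral", "positive")[sign(scores$sentiment1) + 2]

scores
```

Whether an empty review should count as neutral or be dropped as missing depends on the analysis; `NA` would be an equally defensible choice.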

Of course, simply calling head(get_sentiment(sentReviews, method = "nrc")) from the syuzhet package returns the same scores as positive minus negative.

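But be sure to check several sentiment scores obtained with different methods, via syuzhet or even sentimentr or other packages such as sentiment.ai and tardis. They will all give slightly different scores depending on the scoring function used, on whether valence shifters are taken into account (negators such as "not", amplifiers such as "very", and so on), and on whether the approach is suited to the job. Some are less suited to tweets, while others handle them well.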
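One way to run that comparison is sketched below. It assumes syuzhet is installed and uses two made-up example sentences rather than the Yelp data; the sentimentr part runs only if that package is available. No output is shown because the scores depend on the installed lexicon versions.

```r
library(syuzhet)

# Made-up example sentences, not taken from the review data.
reviews <- c(
  "The pizza was not good and the staff were not friendly.",
  "Great crust, friendly staff, and very fast delivery."
)

# Score the same texts with each dictionary-based method in syuzhet.
methods <- c("syuzhet", "bing", "afinn", "nrc")
by_method <- sapply(methods, function(m) get_sentiment(reviews, method = m))
by_method  # one row per review, one column per method

# sentimentr takes valence shifters such as "not" and "very" into account.
if (requireNamespace("sentimentr", quietly = TRUE)) {
  print(sentimentr::sentiment_by(sentimentr::get_sentences(reviews)))
}
```

If the methods disagree on the sign for a review, that review is worth reading by hand before trusting any single score.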

Solution 1
0 2022-12-24 10:04:19
