How do I use a custom NRC-style lexicon with Syuzhet in R?

Question

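I am new to R and new to working with Syuzhet.

I am trying to build a custom NRC-style lexicon to use with the Syuzhet package so that I can categorize words. Unfortunately, although this feature now exists in Syuzhet, it does not seem to recognize my custom lexicon. Please excuse my odd variable names and the extra libraries; I plan to use them for other things later, and I am just testing things out.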

library(sentimentr)
library(pdftools)
library(tm)
library(readxl)
library(syuzhet)
library(tidytext)

texto <- "I am so love hate beautiful ugly"

text_cust <- get_tokens(texto)


custom_lexicon <- data.frame(lang = c("eng","eng","eng","eng"), word = c("love", "hate", "beautiful", "ugly"), sentiment = c("positive","positive","positive","positive"), value = c("1","1","1","1"))


my_custom_values <- get_nrc_sentiment(text_cust, lexicon = custom_lexicon)

I get the following error:

my_custom_values <- get_nrc_sentiment(text_cust, lexicon = custom_lexicon)
New names:
• value -> value...4
• value -> value...5
Error in FUN(X[[i]], ...) :
  custom lexicon must have a 'word', a 'sentiment' and a 'value' column

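As far as I can tell, my data frame exactly matches the one used by the standard NRC lexicon, with columns labeled "word", "sentiment", and "value". So I am not sure why I am getting this error.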

Answer 1

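The CRAN version of syuzhet's get_nrc_sentiment does not accept a lexicon; get_sentiment does. Your custom_lexicon also contains an error: the values need to be numeric (integer) values, not character strings. And to use your own lexicon you need to set method to "custom", otherwise the custom lexicon is ignored. The code below needs only syuzhet.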

library(syuzhet)

texto <- "I am so love hate beautiful ugly"

text_cust <- get_tokens(texto)
custom_lexicon <- data.frame(lang = c("eng","eng","eng","eng"), 
                             word = c("love", "hate", "beautiful", "ugly"), 
                             sentiment = c("positive","positive","positive","positive"), 
                             value = c(1,1,1,1))
get_sentiment(text_cust, method = "custom", lexicon = custom_lexicon)    

[1] 0 0 0 1 1 1 1
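
If the lexicon comes from a spreadsheet or CSV file, the value column can easily arrive as character strings, as in the question. The sketch below shows one way to check and coerce a lexicon before passing it to get_sentiment. Note that prepare_lexicon is a hypothetical helper written for this example, not part of syuzhet, and the assumption that only the "word" and "value" columns are required for method = "custom" is based on my reading of the get_sentiment documentation.

```r
library(syuzhet)

# The lexicon from the question, with `value` stored as character strings.
custom_lexicon <- data.frame(
  lang = c("eng", "eng", "eng", "eng"),
  word = c("love", "hate", "beautiful", "ugly"),
  sentiment = c("positive", "positive", "positive", "positive"),
  value = c("1", "1", "1", "1"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not a syuzhet function): checks that the columns
# used by method = "custom" are present and coerces `value` to numeric.
prepare_lexicon <- function(lexicon) {
  required <- c("word", "value")  # assumption: these two columns suffice
  missing_cols <- setdiff(required, names(lexicon))
  if (length(missing_cols) > 0) {
    stop("lexicon is missing column(s): ", paste(missing_cols, collapse = ", "))
  }
  lexicon$word <- as.character(lexicon$word)
  # as.character() first, so factor columns convert by label, not by level code
  lexicon$value <- suppressWarnings(as.numeric(as.character(lexicon$value)))
  if (anyNA(lexicon$value)) {
    stop("lexicon column 'value' contains non-numeric entries")
  }
  lexicon
}

clean_lexicon <- prepare_lexicon(custom_lexicon)

# Score each token against the cleaned custom lexicon.
tokens <- get_tokens("I am so love hate beautiful ugly")
scores <- get_sentiment(tokens, method = "custom", lexicon = clean_lexicon)
print(scores)
```

Coercing once up front keeps the scoring call itself unchanged, and the explicit stop() messages point at the offending column instead of failing inside get_sentiment.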

Solution 1
0 2022-08-16 09:02:29