How do I use a custom NRC-style lexicon with Syuzhet in R?

Question

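I am new to R and new to working with Syuzhet.

I am trying to build a custom NRC-style lexicon to use with the Syuzhet package so that I can categorize words. Unfortunately, although this feature now exists in Syuzhet, it does not seem to recognize my custom lexicon. Please excuse my odd variable names and the extra libraries; I plan to use them for other things later and am just testing for now.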

library(sentimentr)
library(pdftools)
library(tm)
library(readxl)
library(syuzhet)
library(tidytext)

texto <- "I am so love hate beautiful ugly"

text_cust <- get_tokens(texto)


custom_lexicon <- data.frame(lang = c("eng","eng","eng","eng"), word = c("love", "hate", "beautiful", "ugly"), sentiment = c("positive","positive","positive","positive"), value = c("1","1","1","1"))


my_custom_values <- get_nrc_sentiment(text_cust, lexicon = custom_lexicon)

I get the following error:

my_custom_values <- get_nrc_sentiment(text_cust, lexicon = custom_lexicon)
New names:
• value -> value...4
• value -> value...5
Error in FUN(X[[i]], ...) : custom lexicon must have a 'word', a 'sentiment' and a 'value' column

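As far as I can tell, my data frame matches the standard NRC lexicon's data frame exactly, with columns labeled "word", "sentiment", and "value". So I am not sure why I am getting this error.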

Answer 1

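The CRAN version of syuzhet's get_nrc_sentiment does not accept a lexicon; get_sentiment does. Your custom_lexicon also has an error: the values need to be numeric (for example integer) values, not character strings. And to use your own lexicon you need to set method to "custom"; otherwise the custom lexicon is ignored. The code below needs only syuzhet.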

library(syuzhet)

texto <- "I am so love hate beautiful ugly"

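# Split the text into word tokens
text_cust <- get_tokens(texto)
# Custom lexicon: the value column must be numeric, not character
custom_lexicon <- data.frame(lang = c("eng","eng","eng","eng"), 
                             word = c("love", "hate", "beautiful", "ugly"), 
                             sentiment = c("positive","positive","positive","positive"), 
                             value = c(1,1,1,1))
# method = "custom" is required; otherwise the lexicon argument is ignored
get_sentiment(text_cust, method = "custom", lexicon = custom_lexicon)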

[1] 0 0 0 1 1 1 1
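
If the lexicon already exists with a character value column, as in the question, it can be converted rather than rebuilt. The following is a minimal base R sketch of that conversion, not something taken from the syuzhet documentation; the required_cols vector and the stopifnot check are hypothetical helpers added here for illustration only.

```r
# Lexicon as written in the question: `value` is built from quoted strings.
custom_lexicon <- data.frame(
  lang = c("eng", "eng", "eng", "eng"),
  word = c("love", "hate", "beautiful", "ugly"),
  sentiment = c("positive", "positive", "positive", "positive"),
  value = c("1", "1", "1", "1"),
  stringsAsFactors = FALSE
)

# Inspect the column type; at this point it is character, not numeric.
class(custom_lexicon$value)

# Convert the column to numeric before passing the lexicon on.
custom_lexicon$value <- as.numeric(custom_lexicon$value)

# Hypothetical sanity check (not part of syuzhet): stop early if a
# required column is missing or `value` is still not numeric.
required_cols <- c("word", "sentiment", "value")
stopifnot(
  all(required_cols %in% names(custom_lexicon)),
  is.numeric(custom_lexicon$value)
)
```

After this conversion the data frame has the same shape as the one in the answer and can be passed to get_sentiment with method = "custom" in the same way.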

Solution 1
0 2022-08-16 09:02:29
