
Sentiment prediction using glm

I am trying to predict sentiment using glm and ran into the following problem:

  train_data_df <- as.data.frame(as.matrix(train_data))
  log_model <- glm(sentiment ~ word_count, data = train_data_df, family = binomial)
  Error in sort.list(y) : 'x' must be atomic for 'sort.list'
  Have you called 'sort' on a list?

The data structure of the inputs sentiment and word_count is as follows:

> str(train_data$sentiment[1:2])
List of 2
 $ : num 1
 $ : num 1
> str(train_data$word_count[1:2])
List of 2
 $ :List of 1
  ..$ :Classes 'term_frequency', 'integer'  Named int [1:24] 3 1 1 1 1 1 1 1 1 3 ...
  .. .. ..- attr(*, "names")= chr [1:24] "and" "bags" "came" "disappointed" ...
 $ :List of 1
  ..$ :Classes 'term_frequency', 'integer'  Named int [1:22] 2 1 1 1 1 1 1 1 1 1 ...
  .. .. ..- attr(*, "names")= chr [1:22] "and" "anyone" "bed" "comfortable" ...



head(train_data_df[1,])
                   name
2 Planetwise Wipe Pouch
                                                                                                                                                          review
2 it came early and was not disappointed. i love planet wise bags and now my wipe holder. it keps my osocozy wipes moist and does not leak. highly recommend it.
  rating
2      5
                                                                                                                                                review_clean
2 it came early and was not disappointed i love planet wise bags and now my wipe holder it keps my osocozy wipes moist and does not leak highly recommend it
                                                              word_count sentiment
2 3, 1, 1, 1, 1, 1, 1, 1, 1, 3, 1, 1, 1, 1, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1         1

Thanks in advance for your help.

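In an R formula such as the one you are using, sentiment ~ word_count, each side is expected to hold a single number or factor level per row (this is what 'x' must be atomic means). That is clearly not the case for your word_count column: for each row, word_count appears to be a list made up of several integer values (Have you called 'sort' on a list? Well, indeed you have).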
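One way to check which columns are affected is to test each column with is.list() or is.atomic(). The sketch below uses a small made-up data frame (toy, with invented rows) that only imitates the shape shown in the str() output of the question; it is not the asker's data. Note that in that str() output sentiment is also printed as "List of 2", so it may need the same treatment as word_count.

```r
# Hypothetical stand-in for the question's data: both columns are lists,
# mirroring the str() output above (values are invented for illustration).
toy <- data.frame(id = 1:4)
toy$sentiment  <- list(1, 1, 0, 0)
toy$word_count <- list(
  list(c(and = 3L, bags = 1L)),
  list(c(and = 2L, bed = 1L)),
  list(c(bad = 1L, leak = 2L, not = 1L)),
  list(c(broke = 1L))
)

# TRUE marks a list column, which a formula cannot use directly.
is_list_col <- sapply(toy, is.list)
print(is_list_col)

# is.atomic() tests the property the error message refers to.
print(is.atomic(toy$word_count))
```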

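To confirm that this is the source of the problem, you can replace word_count with the sum of its elements. This should make the code run (whether the result has any practical value for sentiment prediction is another matter, but that is not your actual question here...).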
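A minimal sketch of that check, assuming the columns have the nested-list shape shown in the question. The data frame toy, its values, and the column name total_words are made up for illustration. Each word_count entry is collapsed to a single total with sum(unlist(...)), and sentiment is flattened with unlist() because the str() output shows it is stored as a list too.

```r
# Hypothetical data imitating the question's structure (values are invented).
toy <- data.frame(id = 1:6)
toy$sentiment  <- list(1, 1, 0, 0, 1, 0)
toy$word_count <- list(
  list(c(and = 3L, bags = 1L)),
  list(c(and = 2L, bed = 1L)),
  list(c(bad = 1L, leak = 2L, not = 1L)),
  list(c(broke = 1L)),
  list(c(love = 2L, it = 3L, great = 1L)),
  list(c(poor = 2L, fit = 3L))
)

# Build a frame with one atomic value per row for each variable.
flat <- data.frame(
  sentiment   = unlist(toy$sentiment),
  total_words = sapply(toy$word_count, function(x) sum(unlist(x)))
)
str(flat)

# With atomic columns the formula interface accepts the data.
log_model <- glm(sentiment ~ total_words, data = flat, family = binomial)
print(coef(log_model))
```

This only demonstrates that the error disappears once both sides of the formula are atomic; a single total word count is a very coarse feature, and a per-word representation would be a separate modelling decision.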
