How to create our own model for sentiment analysis using the naive Bayes classifier in Mahout?

I am a beginner in Mahout. I don't know how to create our own model for sentiment analysis using the naive Bayes classifier in Mahout. I want to create my own model to do sentiment analysis on top of some log data. Is there a step-by-step procedure for doing this? For example, which classes do we have to implement, how do we create a model, and how can we use the existing models in Mahout? Any help would be appreciated. Thanks in advance.

This presentation is a good guide to the steps for an analysis using the Naive Bayes classifier in Mahout. It is more or less a step-by-step procedure.

A Deep Dive into Classification with Naive Bayes. Along the way we take a look at some basics from Ian Witten's Data Mining book and dig into the algorithm....

So, did you look at the quickstart?

The first step is to assess your corpus. How is your log data labeled? How much data do you have? If you have a labeled corpus, just follow the quickstart and substitute your corpus for the one used in the example.
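To answer the "how is it labeled, and how much is there" questions concretely, a small script can count the examples per label. The sketch below is only an illustration and makes an assumption that is not stated in the answer: the corpus is stored as one subdirectory per label (for example `positive/` and `negative/`), with one file per example. The function name `count_examples_per_label` is hypothetical.

```python
from collections import Counter
from pathlib import Path


def count_examples_per_label(corpus_root):
    """Count files in each label subdirectory of corpus_root.

    Assumed layout (an assumption for illustration only):
        corpus_root/<label>/<one file per example>
    """
    counts = Counter()
    for label_dir in Path(corpus_root).iterdir():
        if label_dir.is_dir():
            # Only regular files count as examples; nested folders are ignored.
            counts[label_dir.name] = sum(
                1 for entry in label_dir.iterdir() if entry.is_file()
            )
    return counts
```

Calling `count_examples_per_label("my_corpus")` returns a `Counter` keyed by label name, which makes it easy to see at a glance whether any label is missing or badly under-represented.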

Before you even start to write code, you need a high-quality corpus. Make sure that your examples are balanced across labels and that you have enough data to train on. You can look at some research corpora to get a general idea of what is required for training. I would suggest the Reuters-21578 corpus or, if you can get it, the RCV-1 corpus.
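One rough way to check balance is to compare the largest and smallest class sizes, and, if they differ a lot, to randomly downsample the larger classes. This is a minimal sketch of that idea, not something prescribed by the answer or by Mahout; the helper names `imbalance_ratio` and `downsample` are hypothetical, and downsampling is only one of several ways to deal with imbalance.

```python
import random


def imbalance_ratio(label_counts):
    """Largest class size divided by the smallest; 1.0 means perfectly balanced."""
    sizes = list(label_counts.values())
    if not sizes or min(sizes) == 0:
        raise ValueError("every label needs at least one example")
    return max(sizes) / min(sizes)


def downsample(examples_by_label, seed=0):
    """Randomly trim every label to the size of the smallest label.

    examples_by_label maps a label to a list of examples (e.g. file paths).
    A fixed seed keeps the selection reproducible between runs.
    """
    rng = random.Random(seed)
    target = min(len(items) for items in examples_by_label.values())
    return {
        label: rng.sample(items, target)
        for label, items in examples_by_label.items()
    }
```

For instance, `imbalance_ratio({"positive": 900, "negative": 100})` evaluates to `9.0`, which suggests the negative class is heavily under-represented. Downsampling discards data, so it is only reasonable when the smallest class is still large enough to train on.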

You can check the following blog post, which explains the step-by-step process in detail: http://instantjavasolutions.blogspot.in/2014/10/how-to-train-my-own-model-using-mahout.html
