
Text classification without machine learning or deep learning

I want to make a text classifier, but without using any existing classification algorithm. I may use Twitter data for classification, so I have to classify somehow without a training data set. For now I am thinking of using word frequencies for classification. I can't find any projects that match what I have in mind. Can you point me to a project or article?

You can use rule-based approaches. For example, you can define a set of keywords for each class and assign a text to the class whose keywords it contains. But it is clear that we can't define all possible keywords for every class, which is why this problem is usually solved with machine learning.
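As a rough illustration of the keyword idea, here is a minimal sketch that counts how many keywords of each class occur in a text and picks the class with the highest count. The class names, the keyword lists, and the `classify` helper are hypothetical and chosen only for illustration; real tweets would need far larger lists and better tokenization.

```python
import re
from collections import Counter

# Hypothetical keyword lists; the classes and words are illustrative only.
KEYWORDS = {
    "sports": {"match", "goal", "team", "score", "league"},
    "tech": {"software", "phone", "app", "update", "laptop"},
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def classify(text, keywords=KEYWORDS, default="unknown"):
    """Return the class whose keywords occur most often in the text."""
    counts = Counter(tokenize(text))
    # Score each class by the total frequency of its keywords in the text.
    scores = {
        label: sum(counts[word] for word in words)
        for label, words in keywords.items()
    }
    best = max(scores, key=scores.get)
    # If no keyword matched at all, fall back to the default label.
    return best if scores[best] > 0 else default


if __name__ == "__main__":
    print(classify("Great goal by the home team in tonight's match"))
    print(classify("New app update for my phone"))
    print(classify("I like pizza"))
```

Texts that match no keyword fall back to a default label, which makes the coverage problem mentioned above easy to see in practice.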

Wow. First off, this is a difficult task, and machine learning usually works well for tasks like these. I urge you to try to find training data for your classifier and use machine learning (I find TextBlob to be a great, easy-to-use library for text classification).

To answer your question more directly: you really have to think abstractly about this one, as there are tons of potential things you can try that could yield reliable results. Though Word2Vec works through machine learning, there are lots of interesting and useful concepts within it. See its Wikipedia page for more details. For example, you can take a look at "word embeddings". Additionally, concepts such as cosine similarity might be of use.
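To make the cosine similarity suggestion concrete, here is a minimal sketch that combines it with the word-frequency idea from the question: each text becomes a word-count vector, and a new text is assigned to the class whose short seed description it is most similar to. This uses plain word counts rather than Word2Vec embeddings, and the seed texts and function names are assumptions made up for the example.

```python
import math
import re
from collections import Counter


def term_frequencies(text):
    """Turn a text into a sparse word-count vector."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical seed descriptions, one per class (illustrative only).
SEED_TEXTS = {
    "sports": "team match goal score league player win",
    "tech": "software phone app update laptop computer release",
}


def classify_by_similarity(text, seeds=SEED_TEXTS):
    """Return the best-matching class and the similarity to every class."""
    vector = term_frequencies(text)
    scores = {
        label: cosine_similarity(vector, term_frequencies(seed))
        for label, seed in seeds.items()
    }
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    label, scores = classify_by_similarity("The team will win the league")
    print(label, scores)
```

Because the seed texts are written by hand rather than learned, no training data set is involved, though the quality depends entirely on how well those seeds describe each class.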

Happy coding!

