
Using Keras for text classification

I am struggling with the bag-of-words / vocabulary approach for representing my input data as one-hot vectors for my neural network model in Keras.

I would like to build a simple 3-layer network, but I need help understanding and developing an approach to transform my labelled data, which takes the form (text, sentiment). The sentiment has 7 labels, in the range 0 to 1 in steps of 0.2.
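For the label side, a minimal sketch of turning the sentiment scores into one-hot targets. Note that 0 to 1 in steps of 0.2 gives six values (0.0, 0.2, 0.4, 0.6, 0.8, 1.0) rather than seven, so the sketch assumes that six-value grid and keeps the class count as a parameter. The names `score_to_class` and `one_hot` are hypothetical helpers, not part of any library.

```python
# Assumption: sentiment scores lie on the grid 0.0, 0.2, ..., 1.0 (six values).
STEP = 0.2
NUM_CLASSES = 6  # adjust if the real label set differs


def score_to_class(score, step=STEP):
    """Map a sentiment score such as 0.4 to an integer class index such as 2."""
    # round() absorbs floating-point noise, e.g. 0.6 / 0.2 == 2.9999999999999996
    return int(round(score / step))


def one_hot(index, num_classes=NUM_CLASSES):
    """Return a list of zeros with a single 1.0 at position `index`."""
    if not 0 <= index < num_classes:
        raise ValueError(f"class index {index} is outside 0..{num_classes - 1}")
    vec = [0.0] * num_classes
    vec[index] = 1.0
    return vec


scores = [0.0, 0.4, 1.0]
targets = [one_hot(score_to_class(s)) for s in scores]
```

Treating the scores as separate classes is only one option; since they are ordered, a single regression output would be another.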

I have tried scikit-learn's vectorisers, but they are too rigid: they tokenise either words or characters, whereas I need each sentence to be compared against a vocabulary that includes words, characters, punctuation and emojis. When I use TF-IDF on a test sentence, it only counts the words and ignores everything else. I also need guidance on taking this one-hot approach and on how to implement it in Keras.
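One way to keep punctuation and emojis is a custom tokeniser. The sketch below uses only the standard library and assumes a token is either a run of word characters or any single non-space character; `tokenize`, `build_vocab` and `vectorize` are hypothetical names chosen for illustration.

```python
import re
from collections import Counter

# Assumption: a token is a run of word characters, or any single other
# non-space character (punctuation mark, emoji code point, symbol).
# Emojis made of several code points would be split into several tokens.
TOKEN_RE = re.compile(r"\w+|[^\w\s]")


def tokenize(text):
    """Split text into lower-cased words plus individual symbols."""
    return TOKEN_RE.findall(text.lower())


def build_vocab(texts):
    """Assign each distinct token an index in order of first appearance."""
    vocab = {}
    for text in texts:
        for tok in tokenize(text):
            vocab.setdefault(tok, len(vocab))
    return vocab


def vectorize(text, vocab):
    """Return a bag-of-words count vector; unknown tokens are ignored."""
    vec = [0] * len(vocab)
    for tok, n in Counter(tokenize(text)).items():
        if tok in vocab:
            vec[vocab[tok]] = n
    return vec


corpus = ["Great film!!", "Awful film 😡"]
vocab = build_vocab(corpus)
vector = vectorize("film 😡 !", vocab)
```

scikit-learn's `CountVectorizer` and `TfidfVectorizer` default to a `token_pattern` that keeps only words of two or more characters, which would explain the behaviour described above. Both accept a `tokenizer=` callable, so a function like `tokenize` could be passed there instead of building the vectors by hand.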

Here is a Keras example that has 8 output classes and uses a bag of words.
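The referenced example is not reproduced on this page. As a rough illustration of the idea, here is a minimal sketch of a three-layer dense network that takes bag-of-words vectors and predicts one of 8 classes. The vocabulary size, layer widths and the random stand-in data are all assumptions made for the sketch, not values from the original example.

```python
import numpy as np
import keras
from keras import layers

VOCAB_SIZE = 1000  # assumption: length of each bag-of-words vector
NUM_CLASSES = 8    # the 8 output classes mentioned above

# Three dense layers: two hidden layers and a softmax output.
model = keras.Sequential([
    keras.Input(shape=(VOCAB_SIZE,)),
    layers.Dense(64, activation="relu"),
    layers.Dense(64, activation="relu"),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])

# Random stand-in data; real input would be the count vectors and
# one-hot labels built from the labelled text.
rng = np.random.default_rng(0)
x = rng.integers(0, 3, size=(32, VOCAB_SIZE)).astype("float32")
y = keras.utils.to_categorical(
    rng.integers(0, NUM_CLASSES, size=32), num_classes=NUM_CLASSES)

model.fit(x, y, epochs=1, batch_size=8, verbose=0)
probs = model.predict(x[:4], verbose=0)  # one probability row per sentence
```

With integer class indices instead of one-hot labels, `sparse_categorical_crossentropy` could be used as the loss and the `to_categorical` step dropped.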

