
How to apply the StringToWordVector filter

I am trying to use the Weka GUI to classify some textual data.

I am using the StringToWordVector filter with the attribute indices option left at its default value, first-last.

However, when I tried to change it to something such as 1, 500-last, it gave me an "invalid range list" error.

Initially, my ARFF file has only 2 attributes:

class
text

Is there anything I am doing wrong?

I am pretty sure there are a lot of words in the text file: when I run the filter with the default first-last, it gives me a full 10,000 attributes.

The attribute indices option takes the index, or indices, of the attributes whose values you wish to convert to a word vector. You have two attributes: class with index 1 and text with index 2. Setting first-last selects both; it very likely does nothing with class, since that is usually a single value, and builds a word vector from the text attribute.
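To make the idea concrete, here is a toy sketch in plain Python of what "convert a string attribute to a word vector" means. It is not Weka's implementation: the function name string_to_word_vector, the two sample rows, and the simple lowercase-and-split tokenization are all assumptions made for illustration.

```python
def string_to_word_vector(rows, text_index):
    """Toy illustration (not Weka code): replace the string column at the
    0-based position text_index with one 0/1 column per distinct word."""
    # Vocabulary = every distinct lowercase word seen in the text column.
    vocab = sorted({w for row in rows for w in row[text_index].lower().split()})
    converted = []
    for row in rows:
        words = set(row[text_index].lower().split())
        # Keep the other columns (here: class) unchanged.
        kept = [v for i, v in enumerate(row) if i != text_index]
        # Append one presence flag per vocabulary word.
        converted.append(kept + [1 if w in words else 0 for w in vocab])
    return vocab, converted


# Hypothetical two-attribute data set: class, text.
rows = [("spam", "buy cheap pills"), ("ham", "see you at lunch")]
vocab, converted = string_to_word_vector(rows, text_index=1)
print(len(vocab))    # number of new word attributes
print(converted[0])  # class value followed by the word flags
```

The data set starts with 2 attributes and ends with 1 class attribute plus one attribute per distinct word, which is why a real corpus can end up with thousands of attributes after filtering.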

To cut to the chase, your only options in this case are 2 or first-last, and the result will be the same. 500 is out of range since you have only 2 attributes.
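The range check can be sketched as follows. This is a minimal illustration in plain Python, assuming 1-based indices and the first/last keywords described above; parse_range_list is a hypothetical helper, not Weka's own parser, and its error message only imitates the one from the question.

```python
def parse_range_list(spec, num_attributes):
    """Toy parser (not Weka code) for a 1-based range list such as
    "first-last" or "1,3-5". Returns the selected 0-based indices."""

    def index_of(token):
        token = token.strip().lower()
        if token == "first":
            return 1
        if token == "last":
            return num_attributes
        value = int(token)  # raises ValueError for non-numeric tokens
        if not 1 <= value <= num_attributes:
            raise ValueError(
                f"invalid range list: {value} is outside 1..{num_attributes}"
            )
        return value

    selected = set()
    for part in spec.split(","):
        low, dash, high = part.partition("-")
        start = index_of(low)
        end = index_of(high) if dash else start
        if start > end:
            raise ValueError(f"invalid range list: {part.strip()!r}")
        selected.update(range(start - 1, end))
    return sorted(selected)


print(parse_range_list("first-last", 2))  # both attributes
print(parse_range_list("2", 2))           # only the text attribute
try:
    parse_range_list("1, 500-last", 2)
except ValueError as err:
    print(err)  # 500 does not exist when there are only 2 attributes
```

The range is checked against the attributes that exist before the filter runs (2 here), not against the word attributes the filter will create afterwards.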

PS. If you wish to use just a range of the words from the resulting word vector, you can use the Remove filter and specify the indices of the columns (words) you wish to remove.
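As a rough picture of that second step, the sketch below drops columns by 1-based index from an already converted data set. It is plain Python rather than Weka code; remove_columns, its invert parameter, and the sample rows are hypothetical names and data chosen for illustration.

```python
def remove_columns(rows, one_based_indices, invert=False):
    """Toy illustration (not Weka code): drop the listed 1-based columns,
    or keep only those columns when invert is True."""
    chosen = {i - 1 for i in one_based_indices}
    if invert:
        return [[v for i, v in enumerate(row) if i in chosen] for row in rows]
    return [[v for i, v in enumerate(row) if i not in chosen] for row in rows]


# Hypothetical output of the word-vector step: class plus three word columns.
rows = [["spam", 0, 1, 1], ["ham", 1, 0, 0]]
print(remove_columns(rows, [2, 3]))               # drop word columns 2 and 3
print(remove_columns(rows, [2, 3], invert=True))  # keep only columns 2 and 3
```

At this point large indices such as 500 become meaningful, because the data set now has one column per word instead of the original 2 attributes.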
