How to apply the StringToWordVector filter

Question

I am trying to use the Weka GUI to classify some textual data.

I am using the StringToWordVector filter with the attribute indices option left at its default value, first-last.

However, when I change it to something such as 1, 500-last, it gives me an "invalid range list" error.

Initially my ARFF file has only 2 attributes:

class
text

Is there anything I am doing wrong?

I am pretty sure there are a lot of words in the text file, and when I run the filter with the default first-last setting it gives me a full 10,000 attributes.

Answer 1

The attribute indices option takes the index, or indices, of the attributes whose values you wish to convert to a word vector. You have two attributes: class with index 1 and text with index 2. Setting first-last selects both; it very likely does nothing with class, since that usually holds a single value, and builds a word vector from the text attribute.

To cut to the chase, your only options in this case are 2 or first-last, and the result will be the same. 500 is out of range since you have only 2 attributes.
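To make the out-of-range point concrete, here is a minimal Python sketch of how a 1-based range list could be checked against the number of attributes. It is not Weka's actual implementation; the function name `resolve_indices` and the error message are assumptions made for illustration only.

```python
def resolve_indices(spec, num_attributes):
    """Expand a 1-based range list such as "first-last" or "1,3-5".

    Hypothetical helper, not part of Weka: it only mimics the idea that
    every index must lie between 1 and the number of attributes.
    """
    def to_index(token):
        token = token.strip().lower()
        if token == "first":
            return 1
        if token == "last":
            return num_attributes
        return int(token)  # raises ValueError for non-numeric tokens

    selected = set()
    for part in spec.split(","):
        bounds = part.split("-")
        if len(bounds) > 2:
            raise ValueError(f"invalid range {part.strip()!r}")
        low = to_index(bounds[0])
        high = to_index(bounds[-1])
        # Both ends must exist in the dataset and be in ascending order.
        if not (1 <= low <= high <= num_attributes):
            raise ValueError(
                f"invalid range {part.strip()!r} for {num_attributes} attributes"
            )
        selected.update(range(low, high + 1))
    return sorted(selected)


# With the two attributes from the question (class = 1, text = 2):
print(resolve_indices("first-last", 2))
print(resolve_indices("2", 2))
try:
    resolve_indices("1,500-last", 2)
except ValueError as err:
    print(err)
```

In this sketch, "first-last" and "2" both resolve to valid indices, while "1,500-last" is rejected because index 500 does not exist in a dataset with only 2 attributes.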

PS. If you wish to use just a range of words from the resulting word vector, you can apply the Remove filter and specify the indices of the columns (words) you wish to remove.
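The two-step idea in the PS (vectorize first, then drop word columns by index) can be sketched in plain Python. This is a toy illustration, not Weka code: `to_word_vectors` and `remove_columns` are hypothetical helpers, and the word counts and alphabetical column order are simplifying assumptions rather than Weka's behaviour.

```python
from collections import Counter


def to_word_vectors(texts):
    """Turn each text into a row of word counts over a shared vocabulary."""
    vocabulary = sorted({word for text in texts for word in text.lower().split()})
    rows = []
    for text in texts:
        counts = Counter(text.lower().split())
        rows.append([counts[word] for word in vocabulary])
    return vocabulary, rows


def remove_columns(vocabulary, rows, first, last):
    """Drop the 1-based, inclusive column range first..last."""
    keep = [i for i in range(len(vocabulary)) if not (first <= i + 1 <= last)]
    kept_vocabulary = [vocabulary[i] for i in keep]
    kept_rows = [[row[i] for i in keep] for row in rows]
    return kept_vocabulary, kept_rows


vocabulary, rows = to_word_vectors(["good movie", "bad movie bad plot"])
print(vocabulary)  # ['bad', 'good', 'movie', 'plot']

# Indices now refer to the generated word columns, so a range such as 3-4
# is valid here even though the original data had only two attributes.
kept_vocabulary, kept_rows = remove_columns(vocabulary, rows, 3, 4)
print(kept_vocabulary)  # ['bad', 'good']
print(kept_rows)        # [[0, 1], [2, 0]]
```

The point of the sketch is the order of operations: large indices only make sense after the text attribute has been expanded into many word columns.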

Answer 1: accepted, 2014-05-15 07:21:44
