WEKA Decision Tree with String attributes

I have a set of 20 attributes, a few of which are strings, such as US state codes, names of subscription plans, and so on. How can string attributes be handled in WEKA for decision tree construction?

I read about the StringToWordVector converter, but each value of these attributes is just a single word.

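You have probably figured it out already: you have to declare such "string attributes" (actual string attributes are a different declaration in WEKA) as nominal attributes, i.e. you have to list all the values they can take, in curly braces, in the ARFF header.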
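As a rough illustration of what such a nominal declaration looks like, here is a small sketch in plain Java (no WEKA dependency) that builds the `@attribute` header line from the values observed in a column. The class name `ArffNominal`, the helper names, and the sample attribute names `state` and `plan` are made up for this example, and the quoting rule is a simplified assumption rather than the full ARFF grammar.

```java
import java.util.Collection;
import java.util.List;
import java.util.TreeSet;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class ArffNominal {

    // Characters that we assume require quoting in an ARFF header (simplified).
    private static final Pattern NEEDS_QUOTES = Pattern.compile("[\\s,{}'\"%]");

    // Wrap a value in single quotes when it is empty or contains special characters.
    static String quote(String value) {
        if (value.isEmpty() || NEEDS_QUOTES.matcher(value).find()) {
            return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'";
        }
        return value;
    }

    // Build "@attribute <name> {v1,v2,...}" from the distinct observed values, sorted.
    static String nominalDeclaration(String name, Collection<String> observed) {
        String values = new TreeSet<>(observed).stream()
                .map(ArffNominal::quote)
                .collect(Collectors.joining(","));
        return "@attribute " + quote(name) + " {" + values + "}";
    }

    public static void main(String[] args) {
        // Hypothetical sample columns: state codes and subscription plan names.
        System.out.println(nominalDeclaration("state", List.of("NY", "CA", "TX", "NY")));
        System.out.println(nominalDeclaration("plan", List.of("Basic", "Gold Plus", "Basic")));
    }
}
```

Running it prints `@attribute state {CA,NY,TX}` and `@attribute plan {Basic,'Gold Plus'}`, which can be pasted into the ARFF header in place of a `string` declaration. If the data is already loaded with string attributes, WEKA's `StringToNominal` filter is another way to get the same result.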

Just declare the attribute following this schema in your ARFF file:

@attribute <att_name> string

Be careful because Strings are stored internally in a string table and represented by their address in that table. Thus, two strings that contain the same characters will have the same value.

Source (book): Data Mining: Practical Machine Learning Tools and Techniques 3rd Edition
