I have a set of 20 attributes, a few of which are strings, such as US state codes, names of subscription plans, and so on. How can we handle string attributes in WEKA for decision tree construction?
I read about the StringToWordVector filter, but each value of these attributes is just a single word.
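You have probably figured it out already: you have to declare such "string attributes" (an actual string attribute is a different declaration in WEKA) as nominal attributes, i.e., you have to list all the values they can take, in curly braces, in the ARFF header.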
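As a rough illustration of what such a nominal declaration looks like, here is a minimal Python sketch that collects the distinct values of a column and prints the matching ARFF header line. The helper name `nominal_attribute`, the attribute names `state` and `plan`, and the sample values are all made up for this example, and the quoting rule is simplified (it only quotes values containing whitespace, commas, or braces).

```python
def nominal_attribute(name, values):
    """Return an ARFF @attribute line that declares `name` as nominal."""
    # Keep the first occurrence of each distinct value, in input order.
    distinct = list(dict.fromkeys(values))

    def quote(value):
        # Simplified rule (an assumption for this sketch): wrap values that
        # contain whitespace, commas, or braces in single quotes.
        if any(ch.isspace() or ch in ",{}" for ch in value):
            return "'" + value + "'"
        return value

    return "@attribute {} {{{}}}".format(name, ",".join(quote(v) for v in distinct))


# Hypothetical sample columns.
states = ["CA", "NY", "CA", "TX"]
plans = ["basic", "premium plus", "basic"]

print(nominal_attribute("state", states))  # @attribute state {CA,NY,TX}
print(nominal_attribute("plan", plans))    # @attribute plan {basic,'premium plus'}
```

The printed lines go into the ARFF header in place of a `string` declaration for those attributes.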
Just declare the attribute following this schema in your ARFF file:
@attribute <att_name> string
Be careful, because strings are stored internally in a string table and represented by their address in that table. Thus, two strings that contain the same characters will have the same value.
Source (book): Data Mining: Practical Machine Learning Tools and Techniques 3rd Edition
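To make the string table idea from the quote concrete, here is a small Python sketch of such a table. It is not WEKA's actual implementation; the class `StringTable` and its methods are hypothetical and only illustrate why two strings with the same characters end up with the same stored value.

```python
class StringTable:
    """Toy string table: each distinct string is stored once."""

    def __init__(self):
        self._index = {}   # string -> position in the table
        self._values = []  # position in the table -> string

    def add(self, text):
        # Reuse the existing position if the string is already in the table.
        if text not in self._index:
            self._index[text] = len(self._values)
            self._values.append(text)
        return self._index[text]

    def value(self, position):
        # Look the original string up again from its position.
        return self._values[position]


table = StringTable()
a = table.add("premium")
b = table.add("basic")
c = table.add("premium")  # same characters as the first string

print(a == c)          # True: identical strings share one entry
print(a == b)          # False: different strings get different entries
print(table.value(c))  # premium
```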