Error in package klaR kmodes: Column index must be at most 5 if positive, not 6
I am applying the klaR kmodes algorithm to the dataset below:
> summary(raw)
CREDIT_LIMIT     CP             gender    IE_CHILD_NB  IE_TOT_DEP_NB  TOTAL_INCOME    IE_HOUSE_CHARGE  maritial
>2000    :  612  11500  :  145  MM: 5435  0:7432       0:1446         >2000    :3524  >2000    :    2  D   : 1195
0-500    :10458  11100  :   90  MR:12983  1:4119       1:3748         0-500    :1503  0-500    :17146  M   :10507
1000-1500: 2912  08830  :   71            2:5787       2:3386         1000-1500:6649  1000-1500:   44  MISS: 1446
1500-2000: 2254  11406  :   68            3: 947       3:3740         1500-2000:4116  1500-2000:    5  Ot  : 1043
500-1000 : 2182  35018  :   66            4: 133       4:6098         500-1000 :2626  500-1000 : 1221  S   : 4227
                 11510  :   62
                 (Other):17916
new_age      job_age
>70  : 295   0-20 :14627
0-30 : 815   20-30: 1986
30-40:4867   30-40:  612
40-50:7293   40-50:  124
50-60:3883   50-60: 1069
60-70:1265
I get the following error:
> cluster.results <-kmodes(data=raw, modes=4, iter.max = 10, weighted=FALSE )
Error: Column index must be at most 5 if positive, not 6
Any idea what this error is about?
Best
Partial answer for anyone searching for this error: it means that somewhere an object is being asked to return elements outside its range, such as more columns than exist, e.g.:
> aa <- tibble(bb = c(1,2))
> aa
# A tibble: 2 x 1
bb
<dbl>
1 1.00
2 2.00
> aa[,2]
Error: Column index must be at most 1 if positive, not 2
I'm not sure of the exact source of the error in this case. It doesn't occur with lists or plain data frames (data frames return "undefined columns selected", and lists return NULL), and I don't use that package.
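To make the comparison above concrete, here is a minimal sketch that indexes a non-existent second column on a tibble, a plain data frame, and a list. The helper name try_index and the toy objects are made up for illustration, and the exact wording of the tibble error depends on the installed tibble version, so the code captures the message rather than assuming it.

```r
library(tibble)

# Hypothetical helper: evaluate an expression and return either its value
# or the error message as a string prefixed with "Error:".
try_index <- function(expr) {
  tryCatch(expr, error = function(e) paste("Error:", conditionMessage(e)))
}

tb  <- tibble(bb = c(1, 2))
df  <- data.frame(bb = c(1, 2))
lst <- list(bb = c(1, 2))

# tibble: out-of-range column index is an error (wording varies by version)
tb_result <- try_index(tb[, 2])

# plain data frame: also an error, but with the base R message
df_result <- try_index(df[, 2])

# list: asking for a missing element by name yields NULL, not an error
lst_result <- lst$cc

print(tb_result)
print(df_result)
print(lst_result)
```

The tibble-specific message is a hint that the object passed to the failing function is a tibble rather than a plain data frame.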
I experienced the same problem when trying to use kmodes to cluster the following categorical data frame:
> summary(raw_df)
Age           Years_At_Present_Employment  Marital_Status_Gender  Dependents  Housing    Job
(0,20] :  80  A71: 310                     A91: 250               1:4225      A151: 895  A171: 110
(20,30]:1975  A72: 860                     A92:1550               2: 775      A152:3565  A172:1000
(30,45]:2015  A73:1695                     A93:2740                           A153: 540  A173:3150
(45,60]: 705  A74: 870                     A94: 460                                      A174: 740
(60,75]: 225  A75:1265
Foreign_Worker  Current_Address_Yrs  Telephone
A201:4815       Min.   :1.000        A191:2980
A202: 185       1st Qu.:2.000        A192:2020
                Median :3.000
                Mean   :2.845
                3rd Qu.:4.000
                Max.   :4.000
Then I got the error:
> (raw_clusters <- klaR::kmodes(raw_df, 5))
Error: Column index must be at most 4 if positive, not 6
It seems that this implementation of kmodes (klaR) requires the categorical variables to be numeric, so you need to convert the variables from factors to numeric (keeping in mind that they are still really categorical):
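library(dplyr)  # provides %>% and mutate()

# as.numeric() on a factor returns its integer level codes (1, 2, 3, ...),
# so each category is replaced by an arbitrary numeric label.
raw_4clust <- raw_df %>%
  mutate(
    Age = as.numeric(Age),
    Years_At_Present_Employment = as.numeric(Years_At_Present_Employment),
    Marital_Status_Gender = as.numeric(Marital_Status_Gender),
    Housing = as.numeric(Housing),
    Job = as.numeric(Job),
    Foreign_Worker = as.numeric(Foreign_Worker),
    Telephone = as.numeric(Telephone)
  )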
After that it worked for me.
Hope that helps.
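As a small illustration of what that conversion does, the sketch below shows the level codes produced by as.numeric() on a factor, and a shorter way to convert every factor column at once. The toy data frame is invented for the example, and the across(where(is.factor), ...) form assumes dplyr 1.0 or later.

```r
library(dplyr)

# A factor's numeric form is its level code, not any meaningful quantity.
# Levels are sorted alphabetically: A71 = 1, A72 = 2, A73 = 3.
f <- factor(c("A71", "A73", "A72"))
codes <- as.numeric(f)
print(codes)

# Hypothetical toy data: one factor column and one genuinely numeric column.
toy <- data.frame(
  Housing = factor(c("A151", "A152", "A152")),
  Current_Address_Yrs = c(1, 4, 2)
)

# Convert all factor columns in one step; numeric columns are left untouched.
toy_num <- toy %>%
  mutate(across(where(is.factor), as.numeric))
print(toy_num)
```

Because the codes are only labels, they should not be interpreted as distances or magnitudes outside the clustering call.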
In my case, I had used dplyr for the data transformation, so what I did was convert my object to a plain data frame:
tmp = as.data.frame(tmp)
And that solved my problem.
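A minimal sketch of why that conversion can matter, assuming the object coming out of the dplyr pipeline is a tibble: a tibble keeps its own indexing rules (single-column selection stays a tibble, and out-of-range indices raise the tibble error seen above), while as.data.frame() restores base R behaviour. The toy values below are invented for illustration; the column names are borrowed from the question.

```r
library(tibble)

# Toy stand-in for the real data (hypothetical values, not from the question).
tmp <- tibble(
  gender   = factor(c("MM", "MR", "MR", "MM")),
  maritial = factor(c("D", "M", "S", "M"))
)

print(class(tmp))            # includes "tbl_df", so tibble indexing applies
print(class(tmp[, 1]))       # still a tibble, not a vector

plain <- as.data.frame(tmp)
print(class(plain))          # "data.frame" only
print(is.factor(plain[, 1])) # single column now drops to a factor vector
```

The converted object can then be passed to kmodes in the same way as in the question, for example with data = plain.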