Error in klaR package kmodes: Column index must be at most 5 if positive, not 6

I am applying the klaR kmodes algorithm to the dataset below:

> summary(raw)
    CREDIT_LIMIT         CP        gender     IE_CHILD_NB IE_TOT_DEP_NB    TOTAL_INCOME   IE_HOUSE_CHARGE  maritial    
 >2000    :  612   11500  :  145   MM: 5435   0:7432      0:1446        >2000    :3524   >2000    :    2   D   : 1195  
 0-500    :10458   11100  :   90   MR:12983   1:4119      1:3748        0-500    :1503   0-500    :17146   M   :10507  
 1000-1500: 2912   08830  :   71              2:5787      2:3386        1000-1500:6649   1000-1500:   44   MISS: 1446  
 1500-2000: 2254   11406  :   68              3: 947      3:3740        1500-2000:4116   1500-2000:    5   Ot  : 1043  
 500-1000 : 2182   35018  :   66              4: 133      4:6098        500-1000 :2626   500-1000 : 1221   S   : 4227  
                   11510  :   62                                                                                       
                   (Other):17916                                                                                       
  new_age      job_age     
 >70  : 295   0-20 :14627  
 0-30 : 815   20-30: 1986  
 30-40:4867   30-40:  612  
 40-50:7293   40-50:  124  
 50-60:3883   50-60: 1069  
 60-70:1265              

I get the following error:

> cluster.results <-kmodes(data=raw, modes=4, iter.max = 10, weighted=FALSE )
Error: Column index must be at most 5 if positive, not 6

Any idea what this error is about?

Best

Partial answer for anyone searching for this error: it means that somewhere an object is being asked to return elements outside its range, such as a column index larger than the number of columns that exist, e.g.:

> aa <- tibble(bb = c(1,2))
> aa
# A tibble: 2 x 1
     bb
  <dbl>
1  1.00
2  2.00
> aa[,2]
Error: Column index must be at most 1 if positive, not 2

I'm not sure of the exact source of the error in this case. It doesn't occur with lists or base data frames (data frames fail with "undefined columns selected" instead, and lists return NULL), and I don't use that package.
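To make that comparison concrete, here is a minimal sketch in base R of how a request for a missing column behaves on a base data frame and on a list. The object names df and lst are made up for illustration, and the exact error wording may vary between R versions.

```r
# A one-column data frame and a one-element list (hypothetical examples).
df  <- data.frame(bb = c(1, 2))
lst <- list(bb = c(1, 2))

# A base data frame signals its own error when column 2 does not exist.
df_msg <- tryCatch(
  {
    df[, 2]
    NA_character_
  },
  error = function(e) conditionMessage(e)
)
print(df_msg)

# Looking up a missing name in a list returns NULL instead of failing.
lst_val <- lst[["cc"]]
print(is.null(lst_val))
```

Neither case produces the "Column index must be at most ..." wording, which suggests the object involved is a tibble rather than a base data frame or a list.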

I experienced the same problem when trying to use kmodes to cluster the following categorical data frame:

 > summary(raw_df)
  Age       Years_At_Present_Employment Marital_Status_Gender Dependents Housing       Job      
  (0,20] :  80   A71: 310                    A91: 250              1:4225     A151: 895   A171: 110  
  (20,30]:1975   A72: 860                    A92:1550              2: 775     A152:3565   A172:1000  
  (30,45]:2015   A73:1695                    A93:2740                         A153: 540   A173:3150  
  (45,60]: 705   A74: 870                    A94: 460                                     A174: 740  
  (60,75]: 225   A75:1265                                                                            

  Foreign_Worker Current_Address_Yrs Telephone  
  A201:4815      Min.   :1.000       A191:2980  
  A202: 185      1st Qu.:2.000       A192:2020  
                 Median :3.000                  
                 Mean   :2.845                  
                 3rd Qu.:4.000                  
                 Max.   :4.000  

Then I got the error:

 > (raw_clusters <- klaR::kmodes(raw_df, 5))
 Error: Column index must be at most 4 if positive, not 6

It seems that this implementation of kmodes (klaR) requires the categorical variables to be numeric, so you need to convert the variables from factors to numbers (keeping in mind that they are really categorical):

library(dplyr)  # provides %>% and mutate()

# Replace each listed factor column with its integer level codes.
raw_4clust <- raw_df %>%
  mutate(
    Age = as.numeric(Age),
    Years_At_Present_Employment = as.numeric(Years_At_Present_Employment),
    Marital_Status_Gender = as.numeric(Marital_Status_Gender),
    Housing = as.numeric(Housing),
    Job = as.numeric(Job),
    Foreign_Worker = as.numeric(Foreign_Worker),
    Telephone = as.numeric(Telephone)
  )

After that it worked for me.

Hope that helps.
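For readers unsure what as.numeric() does to a factor, the following base R sketch shows that it returns the integer level codes, not the labels. The vectors f and g are invented examples that only borrow label styles from the summary above.

```r
# A small factor with labels in the style of the dataset above.
f <- factor(c("A71", "A73", "A71", "A75"))

# as.numeric() on a factor yields the position of each value in levels(f).
codes <- as.numeric(f)
print(codes)

# The original labels can be recovered from the codes via the levels.
labels_back <- levels(f)[codes]
print(labels_back)

# Caveat: numeric-looking labels are NOT parsed as numbers.
g <- factor(c("10", "20", "10"))
g_codes <- as.numeric(g)
print(g_codes)
```

The codes are arbitrary identifiers for the categories, so their order and spacing carry no meaning, which is why the data should still be treated as categorical after the conversion.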

In my case, I had used dplyr for data transformation, so what I did was convert my object to a data frame:

tmp = as.data.frame(tmp)

And that solved my problem.
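A minimal sketch of that fix on toy data is shown below. It assumes the tibble and klaR packages are installed (each call is guarded so the script still runs without them), and the columns of toy are invented for illustration, not taken from either dataset above.

```r
set.seed(42)

# Toy categorical data (hypothetical columns).
toy <- data.frame(
  gender  = factor(sample(c("MM", "MR"), 60, replace = TRUE)),
  housing = factor(sample(c("A151", "A152", "A153"), 60, replace = TRUE)),
  phone   = factor(sample(c("A191", "A192"), 60, replace = TRUE))
)

# dplyr pipelines usually hand back a tibble; mimic that when tibble is available.
tmp <- if (requireNamespace("tibble", quietly = TRUE)) tibble::as_tibble(toy) else toy
print(class(tmp))

# Drop the tibble classes so that indexing follows base data frame rules.
tmp_df <- as.data.frame(tmp)
print(class(tmp_df))

# Cluster only when klaR is installed; the arguments mirror the call in the question.
if (requireNamespace("klaR", quietly = TRUE)) {
  fit <- klaR::kmodes(tmp_df, modes = 2, iter.max = 10, weighted = FALSE)
  print(fit$size)
}
```

Checking class() before calling kmodes is a quick way to tell whether the input is still a tibble.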
