
Apriori Algorithm in R

I have what I thought was a well-prepared dataset. I wanted to use the Apriori Algorithm in R to look for associations and come up with some rules. I have about 16,000 rows (unique customers) and 179 columns that represent various items/categories. The data looks like this:

     Cat1  Cat2  Cat3  Cat4  Cat5 ... Cat179
     1,     0,    0,    0,    1,  ...  0
     0,     0,    0,    0,    0,  ...  1
     0,     1,    1,    0,    0,  ...  0
     ...

I thought having a comma separated file with binary values (1/0) for each customer and category would do the trick, but after I read in the data using:

data5 = read.csv("Z:/CUST_DM/data_test.txt",header = TRUE,sep=",")

and then run this command:

rules = apriori(data5, parameter = list(supp = .001,conf = 0.8))

I get the following error:

Error in asMethod(object):
column(s) 1, 2, 3, ...178 not logical or a factor. Discretize the columns first.  

I understand Discretize but not in this context I guess. Everything is a 1 or 0. I've even changed the data from INT to CHAR and received the same error. I also had the customer ID (unique) as column 1 but I understand that isn't necessary when the data is in this form (flat file). I'm sure there is something obvious I'm missing - I'm new to R.

What am I missing? Thanks for your input.

I solved the problem this way: after reading the data into R, I used lapply() to convert the columns to factors (I think that's what it does). I then created a data frame from the result, and was able to run apriori() on it successfully.
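A minimal sketch of the factor-conversion fix described above. It assumes the arules package is installed, and it uses a small made-up data frame in place of the real data5 (the values below are hypothetical):

```r
library(arules)

# Hypothetical toy stand-in for data5: one row per customer,
# one 0/1 indicator column per category.
data5 <- data.frame(
  Cat1 = c(1, 0, 0, 1),
  Cat2 = c(0, 0, 1, 1),
  Cat3 = c(0, 1, 1, 1)
)

# lapply() returns a list of factors, so wrap it in data.frame()
# to get back a data frame that apriori() can coerce to transactions.
data5_fac <- data.frame(lapply(data5, factor))

rules <- apriori(data5_fac, parameter = list(supp = 0.001, conf = 0.8))
```

One thing to be aware of with this approach: every factor level becomes its own item, so the rules can contain items such as Cat1=0 (absence) as well as Cat1=1 (presence). If only presence should count, the matrix approach is likely the better fit.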

Your data is actually already in (dense) matrix format, but read.csv always reads data in as a data.frame. Just coerce the data to a matrix first:

dat <- as.matrix(data5)
rules <- apriori(dat, parameter = list(supp = .001,conf = 0.8))

1s in the data will be interpreted as the presence of the item and 0s as the absence. More information about how to create transactions can be found in the manual page ?transactions.
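To check how the 0/1 matrix is interpreted before mining, it can be coerced to transactions explicitly and inspected. This is only a sketch: it assumes the arules package is installed and uses a hypothetical toy data frame in place of the real data5.

```r
library(arules)

# Hypothetical toy stand-in for data5 (0/1 indicators per category).
data5 <- data.frame(
  Cat1 = c(1, 0, 0, 1),
  Cat2 = c(0, 0, 1, 1),
  Cat3 = c(0, 1, 1, 1)
)

# Coerce the data frame to a dense 0/1 matrix, then to transactions.
dat   <- as.matrix(data5)
trans <- as(dat, "transactions")

# Column names become the item labels; each row is one transaction.
itemLabels(trans)
summary(trans)

# Mine rules from the transactions and print them.
rules <- apriori(trans, parameter = list(supp = 0.001, conf = 0.8))
inspect(rules)
```

Here only the 1s produce items, so the rules involve items like Cat1 rather than Cat1=0 or Cat1=1.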

