R - Is there a way to limit apriori rules by lift?

Question

I'm looking at this data set: https://archive.ics.uci.edu/ml/machine-learning-databases/credit-screening/crx.data

I preprocessed the data:

ca.1<-read.csv("CreditApproval.csv",T,",")

# From http://stackoverflow.com/q/4787332/
remove_outliers <- function(x, na.rm = TRUE, ...) {
  qnt <- quantile(x, probs=c(.25, .75), na.rm = na.rm, ...)
  H <- 1.5 * IQR(x, na.rm = na.rm)
  y <- x
  y[x < (qnt[1] - H)] <- NA
  y[x > (qnt[2] + H)] <- NA
  y
}

ca.1$A2<-remove_outliers(ca.1$A2)
ca.1$A3<-remove_outliers(ca.1$A3)
ca.1$A8<-remove_outliers(ca.1$A8)
ca.1$A11<-remove_outliers(ca.1$A11)
ca.1$A14<-remove_outliers(ca.1$A14)
ca.1$A15<-remove_outliers(ca.1$A15)
ca.1$A2<-discretize(ca.1$A2,"frequency",categories = 6)
ca.1$A3<-discretize(ca.1$A3,"frequency",categories = 6)
ca.1$A8<-discretize(ca.1$A8,"frequency",categories = 6)
ca.1$A11<-discretize(ca.1$A11,"frequency",categories = 6)
ca.1$A14<-discretize(ca.1$A14,"frequency",categories = 6)
ca.1$A15<-discretize(ca.1$A15,"frequency",categories = 6)

ca.1<-na.omit(ca.1)

After fine-tuning the support, confidence, and minlen/maxlen, I'm still getting 65 rules:

> rules<-apriori(ca.1, parameter= list(supp=0.15, conf=0.89, minlen=3, maxlen=4), appearance=list(rhs=c("class=-", "class=+"), default="lhs"))
> rules.sorted <- sort(rules, by="lift")
> inspect(rules.sorted)
     lhs                     rhs       support   confidence lift    
[1]  {A5=g,A9=t,A10=t}    => {class=+} 0.1521739 0.8974359  2.770607
[2]  {A4=u,A9=t,A10=t}    => {class=+} 0.1521739 0.8974359  2.770607
[3]  {A1=a,A9=f}          => {class=-} 0.1717391 0.9753086  1.442579
[4]  {A1=a,A9=f,A13=g}    => {class=-} 0.1608696 0.9736842  1.440176
...[65]

As you can see, the + rules have a greater lift but lower support and confidence than the - rules. I've been looking through the docs and can't find any parameter to limit by lift. Is this possible? If not, what do you do in situations like this?

Answer 1

The arules package defines a special subset function for this object type. To filter out rules with a lift value less than 2, you can try the following:

subset(rules, subset = lift > 2)
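As a minimal sketch of how this might look end to end, the example below uses the Adult transactions that ship with arules as a stand-in for ca.1. The data set, the thresholds, and the variable names are all assumptions made for illustration. It also shows that the lift condition can be combined with a condition on the right-hand side:

```r
library(arules)

# Adult ships with arules; it stands in for the question's ca.1 here.
data("Adult")

# Mine rules the usual way, constrained by support and confidence.
rules <- apriori(Adult,
                 parameter = list(supp = 0.1, conf = 0.8, minlen = 2, maxlen = 3),
                 control = list(verbose = FALSE))

# Keep only rules whose lift exceeds an (arbitrary) threshold.
high_lift <- subset(rules, subset = lift > 1.2)

# Conditions can be combined: lift plus a partial match (%pin%) on the
# item label of the right-hand side.
marital_rules <- subset(rules, subset = rhs %pin% "marital-status" & lift > 1.2)

# Show the five remaining rules with the highest lift.
inspect(head(sort(high_lift, by = "lift"), 5))
```

For the data in the question, the same pattern would presumably be something like subset(rules, subset = rhs %in% "class=+" & lift > 2).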

Answer 2

You can't limit apriori rules by lift alone. You first have to limit by support and confidence, which you did here:

rules<-apriori(ca.1, parameter= list(supp=0.15, conf=0.89, minlen=3, maxlen=4))

After that, do something like this:

rulesLift <- sort(subset(rules, subset = lift < 2), by="lift") 
inspect(rulesLift)

Answer 3

Another way is to use arules::quality(). For example:

association.rules <- apriori(tr, parameter = list(support=0.005, confidence=0.25, minlen=3, maxlen=10))

subRules<-association.rules[quality(association.rules)$lift > 1]

The same indexing works for any of the quality measures: support, confidence, coverage, lift, and count.
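To make the indexing a bit more concrete, here is a small sketch under the assumption that the Adult transactions bundled with arules stand in for tr, with arbitrary thresholds. quality() returns a data frame with one row per rule, so any logical vector built from its columns can be used to index the rule set:

```r
library(arules)

# Adult ships with arules; it stands in for the answer's tr here.
data("Adult")

rules <- apriori(Adult,
                 parameter = list(supp = 0.1, conf = 0.8, minlen = 2, maxlen = 3),
                 control = list(verbose = FALSE))

# quality() gives a data.frame of interest measures, one row per rule.
q <- quality(rules)
str(q)

# Build a logical mask from several measures and index the rules with it.
keep <- q$lift > 1.2 & q$confidence > 0.9
sub_rules <- rules[keep]

# The same selection expressed with subset(), for comparison.
same <- subset(rules, subset = lift > 1.2 & confidence > 0.9)
length(sub_rules) == length(same)
```

Indexing with quality() is handy when the mask is computed elsewhere. subset() reads more compactly when the condition is a simple expression.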

Answer 4

I think the apriori function does not take lift as one of its parameters. I get this error if I try to set lift:

Error: Invalid parameter: lift

Instead, I could sort the rules by lift and pick rules based on the lift value, as follows:

sort(rules, by="lift", decreasing=TRUE)

This is not a straightforward solution, but it is a decent workaround.
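A rough sketch of the sort-and-pick idea, again assuming the Adult transactions bundled with arules in place of the real data and an arbitrary cutoff of ten rules:

```r
library(arules)

# Adult ships with arules; it is only a placeholder data set here.
data("Adult")

rules <- apriori(Adult,
                 parameter = list(supp = 0.1, conf = 0.8, minlen = 2, maxlen = 3),
                 control = list(verbose = FALSE))

# Sort by lift (highest first) and keep the first ten rules.
top_by_lift <- head(sort(rules, by = "lift", decreasing = TRUE), n = 10)

inspect(top_by_lift)
```

This keeps a fixed number of rules rather than applying a lift threshold, which may be preferable when there is no obvious cutoff value.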

Answer 5

What if you tried:

apriori(df, parameter = list(lift = 0.3, minlen =2))

You can set your minimum lift to anything in this case; I just chose 0.3.

5 answers

Answer 1: score 3, 2019-02-15 16:58:27
Answer 2: score 1, 2019-04-08 11:19:43
Answer 3: score 1, 2020-06-03 18:27:01
Answer 4: score 0, 2018-04-12 15:55:27
Answer 5: score -1, 2017-03-22 14:43:41
