Sorting Association Rules in R
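I am trying to accomplish the task described below and am running into a lot of errors. I have spent a lot of time trying to sort the rules and then print the top ten. I know how to print the whole list.
Using R, explore generating rules in a larger data file. Consider the Adult data (available in R with the data(Adult) command). Generate association rules with a confidence threshold of 0.8, using the appearance parameter of the apriori function. Print the top 10 rules sorted by lift. Here is my code so far: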
library(arules)
library(arulesViz)
data(Adult)
head(Adult)
rules <- apriori(Adult, parameter = list(supp = 0.5, conf = 0.8))
top.support <- sort(rules, decreasing = TRUE, na.last = NA, by = "support")
top.ten.support <- sort.list(top.support, partial=10)
inspect(top.ten.support)
top.confidence <- sort(rules, decreasing = TRUE, na.last = NA, by = "confidence")
top.ten.confidence <- sort.list(top.support,partial=10)
inspect(top.ten.confidence)
rules2 <- apriori(Adult, parameter=list(supp = 0.5, conf = 0.8), appearance = income)
top.lift <- sort(rules2, decreasing = TRUE, na.last = NA, by = "lift")
top.ten.lift <- sort.list(top.lift, partial=10)
inspect(top.ten.lift)
1) Print the top 10 rules sorted by support:
R> top.support <- sort(rules, decreasing = TRUE, na.last = NA, by = "support")
R> inspect(head(top.support, 10)) # or inspect(sort(top.support)[1:10])
lhs rhs support confidence lift
1 {} => {capital-loss=None} 0.9533 0.9533 1.0000
2 {} => {capital-gain=None} 0.9174 0.9174 1.0000
3 {} => {native-country=United-States} 0.8974 0.8974 1.0000
4 {capital-gain=None} => {capital-loss=None} 0.8707 0.9491 0.9956
5 {capital-loss=None} => {capital-gain=None} 0.8707 0.9133 0.9956
...
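The pattern in 1) can be wrapped in a small function. The sketch below assumes the arules package is installed; `top_rules` is a hypothetical helper name, not something arules provides. It relies only on `sort()` and `head()`, both of which work on a rule set. The `sort.list()` call from the question is a base R function for ordering plain vectors, so it is probably not the right tool for selecting rules.

```r
library(arules)
data(Adult)

# Same thresholds as in the question; verbose = FALSE only silences the mining log.
rules <- apriori(Adult,
                 parameter = list(supp = 0.5, conf = 0.8),
                 control = list(verbose = FALSE))

# Hypothetical helper (not part of arules): sort by one quality measure,
# highest first, and keep the first n rules.
top_rules <- function(rules, by, n = 10) {
  head(sort(rules, by = by, decreasing = TRUE), n)
}

top_support <- top_rules(rules, "support")
inspect(top_support)
```

The same helper could then be called with "confidence" or "lift" instead of "support".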
2) Print the top 10 rules sorted by confidence:
R> top.confidence <- sort(rules, decreasing = TRUE, na.last = NA, by = "confidence")
R> inspect(head(top.confidence, 10))
lhs rhs support confidence lift
1 {hours-per-week=Full-time} => {capital-loss=None} 0.5607 0.9583 1.0052
2 {workclass=Private} => {capital-loss=None} 0.6640 0.9565 1.0034
3 {workclass=Private,
native-country=United-States} => {capital-loss=None} 0.5897 0.9555 1.0023
4 {capital-gain=None,
hours-per-week=Full-time} => {capital-loss=None} 0.5192 0.9551 1.0019
5 {workclass=Private,
race=White} => {capital-loss=None} 0.5675 0.9550 1.0018
...
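If a plain table is easier to work with than `inspect()` output, one alternative is to coerce the rules to a data frame and order it with base R. This is only a sketch of that route, assuming the arules package is installed; the variable names `rules_df` and `top_df` are made up for illustration.

```r
library(arules)
data(Adult)

rules <- apriori(Adult,
                 parameter = list(supp = 0.5, conf = 0.8),
                 control = list(verbose = FALSE))

# Coerce to a data frame: one row per rule, with the rule text in the
# "rules" column and one column per quality measure.
rules_df <- as(rules, "data.frame")

# Order by confidence, highest first, and keep the first ten rows.
top_df <- head(rules_df[order(rules_df$confidence, decreasing = TRUE), ], 10)
print(top_df[, c("rules", "support", "confidence", "lift")])
```

The data frame loses the itemset structure, so it suits reporting more than further filtering with `subset()` on lhs or rhs.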
3) Use the appearance parameter of apriori and print the top 10 rules sorted by lift:
R> rules2 <- apriori(Adult, parameter=list(supp = 0.1, conf = 0.8),
appearance = list(lhs = c("income=small", "income=large"),
default = "rhs"))
R> top.lift <- sort(rules2, decreasing = TRUE, na.last = NA, by = "lift")
R> inspect(head(subset(top.lift, lhs %pin% "income"), 10))
lhs rhs support confidence lift
1 {income=large} => {marital-status=Married-civ-spouse} 0.1370 0.8535 1.8627
2 {income=large} => {sex=Male} 0.1364 0.8496 1.2710
3 {income=large} => {race=White} 0.1457 0.9077 1.0615
4 {income=small} => {capital-gain=None} 0.4849 0.9581 1.0444
5 {income=large} => {native-country=United-States} 0.1468 0.9146 1.0191
...
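To see how the three columns relate, here is a small base R sketch on toy data. The vectors `A` and `B` are invented for illustration and have nothing to do with the Adult data. It uses the standard definitions: support is the share of transactions containing both sides, confidence is that support divided by the support of the left-hand side, and lift is confidence divided by the support of the right-hand side.

```r
# Toy data (made up): 10 transactions, TRUE means the item is present.
A <- c(TRUE, TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE)
B <- c(TRUE, TRUE, TRUE, FALSE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE)

# Measures for the rule {A} => {B}
support_rule <- mean(A & B)            # share of transactions with both A and B
confidence   <- support_rule / mean(A) # share of A-transactions that also have B
lift         <- confidence / mean(B)   # confidence relative to B's overall share

print(c(support = support_rule, confidence = confidence, lift = lift))

# Applying the same relation to the first row of the output above
# (confidence 0.8535, lift 1.8627) gives the implied support of the rhs item.
implied_rhs_support <- 0.8535 / 1.8627
print(round(implied_rhs_support, 3))
```

This also explains why the rules with an empty left-hand side in 1) all show a lift of 1: with nothing on the left, confidence equals the support of the right-hand side.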