
What algorithm for rule mining in itemsets?

I have the following data: every entry contains an itemset and the class it belongs to (positive or negative).

What algorithm can I use to find out which combinations of items indicate positive or negative?

In the following case, I want to find out that (B, C) indicates positive and (D, E) indicates negative.

B, C, A -> positive

B, C, D -> positive

B, C, E -> positive

B, D, E -> negative

C, D, E -> negative

A, D, E -> negative

Result: (B, C) indicates positive, (D, E) indicates negative.

I've tried frequent itemsets and Apriori, but the results are not good. Is there any other possible method?

One typical approach is to map each pair of items in each record (itemset) to that record's class, positive or negative. Then, for each pair, count how many times it maps to the positive class and how many times to the negative class, and compare the two counts. The class with the greater count is the one that pair indicates.
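The pair-counting idea above could be sketched roughly as follows. This is only a minimal illustration on the six records from the question; the names `records`, `count_pairs`, and `majority_class` are made up for this example, and dropping ties is an assumption, not something the answer specifies.

```python
from collections import Counter
from itertools import combinations

# The six sample records from the question: (itemset, class label).
records = [
    ({"B", "C", "A"}, "positive"),
    ({"B", "C", "D"}, "positive"),
    ({"B", "C", "E"}, "positive"),
    ({"B", "D", "E"}, "negative"),
    ({"C", "D", "E"}, "negative"),
    ({"A", "D", "E"}, "negative"),
]

def count_pairs(records):
    """Map every item pair to a Counter of the class labels it appears with."""
    counts = {}
    for items, label in records:
        # Sort so that a pair is always stored in one canonical order.
        for pair in combinations(sorted(items), 2):
            counts.setdefault(pair, Counter())[label] += 1
    return counts

def majority_class(counts):
    """Pick the class with the greater count per pair (ties are dropped: an assumption)."""
    result = {}
    for pair, c in counts.items():
        pos, neg = c["positive"], c["negative"]
        if pos != neg:
            result[pair] = "positive" if pos > neg else "negative"
    return result

pair_counts = count_pairs(records)
pair_classes = majority_class(pair_counts)

# Print pairs with the most occurrences first.
for pair, label in sorted(pair_classes.items(),
                          key=lambda kv: -sum(pair_counts[kv[0]].values())):
    print(pair, label, dict(pair_counts[pair]))
```

On this data, (B, C) appears 3 times with positive and never with negative, and (D, E) is the reverse. Pairs seen only once, such as (A, B), also get a label, so in practice you would probably want a minimum count before trusting a pair.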

This is very costly, especially when your itemsets contain a large number of items, because the number of pairs grows quadratically with the itemset size. So, in general, you need some sort of data structure to store and retrieve the counts quickly and efficiently.
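As one possible data structure (an assumption on my part, the answer does not name a specific one), an inverted index from each item to the set of record ids containing it lets you look up the class counts of any candidate combination by set intersection, without enumerating every pair up front. The names `build_index` and `class_counts` are hypothetical.

```python
from collections import Counter, defaultdict

# The six sample records from the question: (itemset, class label).
records = [
    ({"B", "C", "A"}, "positive"),
    ({"B", "C", "D"}, "positive"),
    ({"B", "C", "E"}, "positive"),
    ({"B", "D", "E"}, "negative"),
    ({"C", "D", "E"}, "negative"),
    ({"A", "D", "E"}, "negative"),
]

def build_index(records):
    """Build item -> set of record ids, plus a list of labels indexed by record id."""
    index = defaultdict(set)
    labels = []
    for rid, (items, label) in enumerate(records):
        labels.append(label)
        for item in items:
            index[item].add(rid)
    return index, labels

def class_counts(combo, index, labels):
    """Count class labels over the records that contain every item in a non-empty combo."""
    id_sets = [index.get(item, set()) for item in combo]
    matching = set.intersection(*id_sets)
    return Counter(labels[rid] for rid in matching)

index, labels = build_index(records)
bc = class_counts(("B", "C"), index, labels)
de = class_counts(("D", "E"), index, labels)
print(dict(bc), dict(de))
```

The index is built in one pass over the data, and the same lookup works for combinations of any size, not only pairs.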
