
R. How to identify rules

Does anyone know how to identify rules that show which treatments are typically used for which diseases? I have this data. First column: patient, second: disease, third: medicine (some rows list more than one medicine).

P1  D1  M1  
P1  D2  M1  
P2  D3  M2  M3  
P2  D4  M4  
P2  D1  M5  
P2  D2  M6  M7  M8  
P2  D1  M4  M9  
P2  D8  M10 
P3  D9  M11 

I read the data above with this code:

library(arules)  # provides read.transactions(), apriori() and inspect()
t <- read.transactions("data.txt", format="basket", sep="\t", cols=1)
dt <- apriori(t, parameter = list(support=0.002, confidence=0.5))
inspect(dt)

First, change your data from wide to long form. You can use reshape() to do this. Since you didn't provide variable names, your code will look something like:

# Assumes columns 3:5 hold the medicines; the default id column is used because PatientID is not unique per row.
d <- reshape(d, direction="long", varying=list(names(d)[3:5]), v.names="Treatment")

When you do that (and drop the rows that have no treatment), your data will look like:

P1  D1  M1
P1  D2  M1  
P2  D3  M2 
P2  D3  M3
P2  D4  M4  
P2  D1  M5  
P2  D2  M6
P2  D2  M7  
P2  D2  M8    
P2  D1  M4  
P2  D1  M9  
P2  D8  M10 
P3  D9  M11
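The wide-to-long step above can be sketched end to end in base R. This is only an illustration under stated assumptions: the column names `PatientID`, `Disease`, `Med1` to `Med3` and `Slot` are hypothetical (the question gives none), and the data is typed in as text rather than read from `data.txt`.

```r
# The question's data, whitespace separated; rows have 1 to 3 medicines.
wide_text <- "
P1 D1 M1
P1 D2 M1
P2 D3 M2 M3
P2 D4 M4
P2 D1 M5
P2 D2 M6 M7 M8
P2 D1 M4 M9
P2 D8 M10
P3 D9 M11
"

# fill = TRUE pads short rows; col.names fixes the width at five columns.
# All column names here are assumptions made for this sketch.
d <- read.table(text = wide_text, header = FALSE, fill = TRUE,
                stringsAsFactors = FALSE,
                col.names = c("PatientID", "Disease", "Med1", "Med2", "Med3"))

# Stack the three medicine columns into one "Treatment" column.
# No idvar is given, so reshape() adds an "id" column numbering the rows.
long <- reshape(d, direction = "long",
                varying = list(c("Med1", "Med2", "Med3")),
                v.names = "Treatment", timevar = "Slot")

# Restore the original row order, then drop the padded (empty) slots.
long <- long[order(long$id, long$Slot), ]
keep <- !is.na(long$Treatment) & nzchar(long$Treatment)
long <- long[keep, c("PatientID", "Disease", "Treatment")]

print(long, row.names = FALSE)
```

The explicit filter handles both empty strings and `NA`, so it does not matter which of the two the padding produces.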

After you do that you can easily create a contingency table (diseases by treatments) to see how often each treatment is used for each disease. For this your code will look like:

table(d$Disease, d$Treatment)

Producing (something like) this:

     M1 M2 M3
  D1  1  0  0
  D2  1  0  0
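To turn such a table into something closer to "rules", one option is to list each disease/treatment pair with its count and its share of that disease's prescriptions. The sketch below is an assumption-laden illustration, not the answer's own method: the data frame `long` and its column names are hypothetical, and the long-form data from above is typed in directly.

```r
# Long-form data (one row per disease/treatment pair), entered by hand.
long <- data.frame(
  Disease   = c("D1", "D2", "D3", "D3", "D4", "D1", "D2",
                "D2", "D2", "D1", "D1", "D8", "D9"),
  Treatment = c("M1", "M1", "M2", "M3", "M4", "M5", "M6",
                "M7", "M8", "M4", "M9", "M10", "M11"),
  stringsAsFactors = FALSE
)

# Cross-tabulate diseases (rows) against treatments (columns).
counts <- table(Disease = long$Disease, Treatment = long$Treatment)

# One row per pair, with the share of that disease's prescriptions.
pairs <- as.data.frame(counts, stringsAsFactors = FALSE)
pairs$Share <- pairs$Freq / ave(pairs$Freq, pairs$Disease, FUN = sum)

# Keep only pairs that occur, most common treatment first per disease.
pairs <- pairs[pairs$Freq > 0, ]
pairs <- pairs[order(pairs$Disease, -pairs$Share), ]

print(pairs, row.names = FALSE)
```

With a sample this small every pair occurs once, so the shares only become informative on a larger data set.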
