How to add a unique occasion for different observation dates within an individual in R?
I'm trying to figure out how to add a column to my data set that contains a count of unique events, based on date, within each patient. Here is part of my data set:
trialno event date time
3 11301 pm_intake 2010-11-24 19:00
4 11301 am_intake 2010-11-25 07:00
5 11301 pk1 2010-11-25 10:30
6 11301 pm_intake 2010-12-22 19:00
7 11301 am_intake 2010-12-23 07:00
8 11301 pk1 2010-12-23 09:54
9 11301 pk2 2010-12-23 13:07
10 11301 pm_intake 2011-02-02 19:00
11 11301 am_intake 2011-02-03 07:00
12 11301 pk1 2011-02-03 11:30
Basically, each date within each patient indicates a new occasion. If a patient has just a drug administration on a given day, that is one occasion, but if a patient had a drug administration and two measurements on the same day, they all count as the same occasion. The data set does not have a regular pattern (each patient has a different number of events on each date, and a different number of events in total). What I'm trying to achieve is:
trialno event date time OCC
3 11301 pm_intake 2010-11-24 19:00 1
4 11301 am_intake 2010-11-25 07:00 2
5 11301 pk1 2010-11-25 10:30 2
6 11301 pm_intake 2010-12-22 19:00 3
7 11301 am_intake 2010-12-23 07:00 4
8 11301 pk1 2010-12-23 09:54 4
9 11301 pk2 2010-12-23 13:07 4
10 11301 pm_intake 2011-02-02 19:00 5
11 11301 am_intake 2011-02-03 07:00 6
12 11301 pk1 2011-02-03 11:30 6
I think I should apply some kind of loop to identify the unique dates within each patient and count them, but I'm not sure how to write it, so I tried using the apply family of functions.
I thought about first splitting the whole data set into individual patients using the split function:
splitData<- split(data, data$trialno)
Then I would apply lapply and transform to add a new column OCC (occasion), but I don't know how to count the occasions as integers...
I was thinking:
splitData<- lapply(splitData, function(df) {
transform(df, OCC= ??????????????? )})
do.call ("rbind", splitData)
I know how to do it in Excel:
=IF(D5=D4, E4,E4+1)
(If the value in the neighbouring cell is the same as in the cell above it, then the value in my cell is the same as the one above; otherwise it is one greater.) This way the first cell in column E has to be 1, and the others are integers counting the new date events.
I tried looking for similar questions on Stack Overflow but without any luck.
Help much appreciated!
If I understand correctly, you want OCC to index the unique dates within each trial number, restarting at 1 for each new trial number. This can be accomplished most easily using the data.table package.
First I'll generate some data with multiple trial numbers:
> dt0
trialno event date time
1 11301 pm_intake 2010-11-24 19:00
2 11301 am_intake 2010-11-25 07:00
3 11301 pk1 2010-11-25 10:30
4 11301 pm_intake 2010-12-22 19:00
5 11301 am_intake 2010-12-23 07:00
6 11301 pk1 2010-12-23 09:54
7 11301 pk2 2010-12-23 13:07
8 11301 pm_intake 2011-02-02 19:00
9 11301 am_intake 2011-02-03 07:00
10 11301 pk1 2011-02-03 11:30
11 11302 pk1 2011-02-03 11:30
12 11302 pk1 2011-02-03 11:40
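The code that builds dt0 is not shown. As a minimal sketch, the printed data could be reconstructed like this; storing date and time as character strings is an assumption, since the original column types are not given:

```r
# Rebuild the example data printed above.
# Assumption: date and time are plain character columns.
dt0 <- data.frame(
  trialno = c(rep(11301, 10), rep(11302, 2)),
  event   = c("pm_intake", "am_intake", "pk1",
              "pm_intake", "am_intake", "pk1", "pk2",
              "pm_intake", "am_intake", "pk1",
              "pk1", "pk1"),
  date    = c("2010-11-24", "2010-11-25", "2010-11-25",
              "2010-12-22", "2010-12-23", "2010-12-23", "2010-12-23",
              "2011-02-02", "2011-02-03", "2011-02-03",
              "2011-02-03", "2011-02-03"),
  time    = c("19:00", "07:00", "10:30",
              "19:00", "07:00", "09:54", "13:07",
              "19:00", "07:00", "11:30",
              "11:30", "11:40"),
  stringsAsFactors = FALSE
)
```

The second trial number (11302) is there only to show that the occasion counter restarts for each patient.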
The OCC column can be added like this:
> require(data.table)
> dt<-data.table(dt0)
> dt[,OCC:=match(date,unique(date)),by=trialno]
> dt
trialno event date time OCC
1: 11301 pm_intake 2010-11-24 19:00 1
2: 11301 am_intake 2010-11-25 07:00 2
3: 11301 pk1 2010-11-25 10:30 2
4: 11301 pm_intake 2010-12-22 19:00 3
5: 11301 am_intake 2010-12-23 07:00 4
6: 11301 pk1 2010-12-23 09:54 4
7: 11301 pk2 2010-12-23 13:07 4
8: 11301 pm_intake 2011-02-02 19:00 5
9: 11301 am_intake 2011-02-03 07:00 6
10: 11301 pk1 2011-02-03 11:30 6
11: 11302 pk1 2011-02-03 11:30 1
12: 11302 pk1 2011-02-03 11:40 1
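For anyone who prefers not to add a dependency, the same idea can probably be written in base R with ave(). The sketch below is not part of the original answer; the small data frame df and the column OCC2 are made up for illustration. The first variant mirrors match(date, unique(date)); the second mirrors the Excel formula from the question and assumes the rows are already sorted by date within each patient.

```r
# Illustrative subset of the example data (hypothetical object name: df)
df <- data.frame(
  trialno = c(11301, 11301, 11301, 11301, 11302, 11302),
  event   = c("pm_intake", "am_intake", "pk1", "pm_intake", "pk1", "pk1"),
  date    = c("2010-11-24", "2010-11-25", "2010-11-25",
              "2010-12-22", "2011-02-03", "2011-02-03"),
  stringsAsFactors = FALSE
)

# Variant 1: within each trialno, number the dates by order of first appearance.
# ave() is given row indices so that the result stays an integer vector.
df$OCC <- ave(seq_len(nrow(df)), df$trialno, FUN = function(i) {
  d <- df$date[i]
  match(d, unique(d))
})

# Variant 2: Excel-style running counter. Start at 1 and add 1 whenever the
# date differs from the previous row of the same patient.
# Assumes rows are sorted by date within each patient.
df$OCC2 <- ave(seq_len(nrow(df)), df$trialno, FUN = function(i) {
  d <- df$date[i]
  cumsum(c(TRUE, d[-1] != d[-length(d)]))
})

df$OCC   # 1 2 2 3 1 1
df$OCC2  # 1 2 2 3 1 1
```

The two variants agree whenever each patient's rows are grouped by date. If the same date can reappear later for a patient, variant 1 reuses the earlier occasion number while variant 2 starts a new one, so which is appropriate depends on the data.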