Row-wise comparison of a data frame in R
I have a data frame with multiple data points for each ID. When the status value differs between two time points for an ID, I want to flag the first status change. How do I achieve that in R? Below is a sample dataset.
| ID | Time | Status |
|---|---|---|
| ID1 | 0 | X |
| ID1 | 6 | X |
| ID1 | 12 | Y |
| ID1 | 18 | Z |
Result dataset

| ID | Time | Status | Flag |
|---|---|---|---|
| ID1 | 0 | X | |
| ID1 | 6 | X | |
| ID1 | 12 | Y | 1 |
| ID1 | 18 | Z | |
Here is a base R solution with `ave`. It creates a vector `y` that equals 1 wherever the current status differs from the previous one within the same ID. `Flag` is then computed with `diff`, which marks the rows where `y` changes.
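# y is "1" where Status differs from the previous row of the same ID (as.character() also covers factor input)
y <- with(df1, ave(as.character(Status), ID, FUN = function(x) c(0, x[-1] != x[-length(x)])))
# ave() returns character here, hence as.integer(); Flag is 1 where y differs from the previous row's y
df1$Flag <- c(0, diff(as.integer(y)) != 0)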
df1
# ID Time Status Flag
#1 ID1 0 X 0
#2 ID1 6 X 0
#3 ID1 12 Y 1
#4 ID1 18 Z 0
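The two steps above give the expected flags on the sample data. Note that `diff()` runs over the whole column rather than within each ID, so with several IDs or longer status histories it can flag more than the first change. A minimal sketch of one possible variant follows, assuming the goal is exactly one flag per ID at the first status change. The data frame `df2` and the helper `flag_first_change()` are hypothetical names made up for this illustration.

```r
# Hypothetical example data with two IDs (not from the original question)
df2 <- data.frame(
  ID     = c("ID1", "ID1", "ID1", "ID1", "ID2", "ID2", "ID2"),
  Time   = c(0, 6, 12, 18, 0, 6, 12),
  Status = c("X", "X", "Y", "Z", "A", "B", "B"),
  stringsAsFactors = FALSE
)

# Illustrative helper (an assumption, not part of base R):
# returns 1 at the first position where status differs from the previous value
flag_first_change <- function(status) {
  changed <- c(FALSE, status[-1] != status[-length(status)])
  # cumsum(changed) == 1 holds from the first change until the second one
  as.integer(changed & cumsum(changed) == 1)
}

# ave() splits the row indices by ID, so the helper sees one ID at a time
df2$Flag <- ave(seq_len(nrow(df2)), df2$ID,
                FUN = function(i) flag_first_change(df2$Status[i]))
df2
```

Passing row indices to `ave()` keeps the result an integer vector, which avoids the character round trip through `as.integer()`.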
# Sample data (df1) used by the base R code above
df1 <- read.table(text = "
ID Time Status
ID1 0 X
ID1 6 X
ID1 12 Y
ID1 18 Z
", header = TRUE)
You can use `mutate()` with `ifelse()` and `lag()`, then use `replace()` to set every `Flag == 1` after the first one to 0:
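library(dplyr)

df1 %>%
  group_by(ID) %>%
  # 1 where both Time and Status differ from the previous row of the same ID
  mutate(Flag = ifelse(is.na(lag(Status)), 0,
                       as.integer(Time != lag(Time) & Status != lag(Status)))) %>%
  group_by(ID, Flag) %>%
  # among the rows flagged 1 within an ID, keep the first and reset the rest to 0
  mutate(Flag = replace(Flag, Flag == lag(Flag) & Flag == 1, 0))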
# A tibble: 4 x 4
# Groups: ID, Flag [2]
ID Time Status Flag
<fct> <int> <fct> <dbl>
1 ID1 0 X 0
2 ID1 6 X 0
3 ID1 12 Y 1
4 ID1 18 Z 0
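The pipeline above regroups by `Flag` and then modifies that same column. Depending on the dplyr version, modifying a grouping column inside `mutate()` may not be allowed. As a hedged alternative, the same idea can be sketched in a single grouped `mutate()` with `cumsum()`. The intermediate column `changed` and the result name `flagged` are made up for this illustration, and the sketch assumes `Status` has no missing values.

```r
library(dplyr)

# Sample data from the question; stringsAsFactors = FALSE keeps Status as character
df1 <- read.table(text = "
ID Time Status
ID1 0 X
ID1 6 X
ID1 12 Y
ID1 18 Z
", header = TRUE, stringsAsFactors = FALSE)

flagged <- df1 %>%
  group_by(ID) %>%
  mutate(
    # TRUE where Status differs from the previous row; the first row is compared with itself
    changed = Status != lag(Status, default = first(Status)),
    # keep only the first TRUE within each ID
    Flag = as.integer(changed & cumsum(changed) == 1)
  ) %>%
  ungroup() %>%
  select(-changed)

flagged
```

Because the grouping stays on `ID` only, the result can be ungrouped directly and there are no leftover `Flag` groups.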