How to select only specific unique values from a range of columns which are associated with an ID in another column using R
ID conditionA conditionB conditionC
1 1 0 0
1 0 0 1
1 0 0 0
2 1 0 1
2 0 1 0
3 1 0 1
3 0 1 0
3 1 1 0
In the table above, I want only a single value of each condition for each ID, so that each ID takes up a single row. This way I can have one row for each ID, with a 1 or 0 under each condition. Thanks
This can be done easily with the dplyr package.
library(dplyr)

data %>%
  group_by(ID) %>%
  summarize(
    conditionA = max(conditionA),
    conditionB = max(conditionB),
    conditionC = max(conditionC)
  )
group_by() groups the rows by ID, then the summarize() function collapses all rows under each ID into a single one. conditionA takes the maximum value found across all rows for that ID: if a 1 is present, the result is 1; if only 0s are present, the maximum is 0. The same applies to conditionB and conditionC.
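For anyone who wants to try this on the sample from the question, here is a minimal reproducible sketch. The data frame name `data` and the `result` variable are assumptions for illustration, and the `across(starts_with("condition"), max)` form is a swapped-in shorthand (available in dplyr 1.0.0 and later) for writing out each `max()` call by hand; it assumes every condition column name starts with "condition".

```r
library(dplyr)

# Sample data copied from the question (the name `data` is an assumption)
data <- data.frame(
  ID         = c(1, 1, 1, 2, 2, 3, 3, 3),
  conditionA = c(1, 0, 0, 1, 0, 1, 0, 1),
  conditionB = c(0, 0, 0, 0, 1, 0, 1, 1),
  conditionC = c(0, 1, 0, 1, 0, 1, 0, 0)
)

# Take the per-ID maximum of every column whose name starts with "condition"
result <- data %>%
  group_by(ID) %>%
  summarize(across(starts_with("condition"), max), .groups = "drop")

print(result)
# ID 1 -> conditionA = 1, conditionB = 0, conditionC = 1
# ID 2 -> conditionA = 1, conditionB = 1, conditionC = 1
# ID 3 -> conditionA = 1, conditionB = 1, conditionC = 1
```

Because the columns only hold 0s and 1s, max() acts as a logical "any": the result is 1 as soon as one row for that ID has a 1.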