Dataframe new column based on other columns and groupby in R
I have a question about simple data frame manipulation in R. I have the following table df (with more rows, of course):
A_ID | B_ID | C_ID | Value
---|---|---|---
1 | 1 | 1 | 2
1 | 2 | 1 | 1
2 | 3 | 3 | 0
2 | 4 | 3 | 3
And I would like to have the following table:
A_ID | B_ID | C_ID | Value | Value_Equal | Value_NotEqual
---|---|---|---|---|---
1 | 1 | 1 | 2 | 2 | 1
1 | 2 | 1 | 1 | 2 | 1
2 | 3 | 3 | 0 | 0 | 3
2 | 4 | 3 | 3 | 0 | 3
So it is like a group_by on A_ID: for each unique A_ID, I want to check whether B_ID == C_ID. Where this is true, I want that row's Value in Value_Equal ("Equal" here means B_ID == C_ID), and not only for that row but also for the other rows with the same A_ID.
Where it is false, I want that row's Value in the column Value_NotEqual, and again not just for that row but also for the other rows with the same A_ID.
I hope it is clear what I mean. If you have any questions about my problem, just ask. Thanks in advance!
Assuming there is at most one row of each kind per group: group by 'A_ID', then extract the 'Value' using the logical expressions `B_ID == C_ID` and `B_ID != C_ID` to create the 'Value_Equal' and 'Value_NotEqual' columns.
library(dplyr)

df1 %>%
  group_by(A_ID) %>%
  # first Value in the group where the IDs match / do not match
  mutate(Value_Equal = Value[B_ID == C_ID][1],
         Value_NotEqual = Value[B_ID != C_ID][1]) %>%
  ungroup()
Output:
# A tibble: 4 × 6
   A_ID  B_ID  C_ID Value Value_Equal Value_NotEqual
  <int> <int> <int> <int>       <int>          <int>
1     1     1     1     2           2              1
2     1     2     1     1           2              1
3     2     3     3     0           0              3
4     2     4     3     3           0              3
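The trailing `[1]` in the `mutate()` call is what makes the "at most one case" assumption safe to run. The following is a minimal sketch on throwaway vectors, not part of the original answer; the names `value`, `b_id`, `c_id` and the results `eq_val`, `neq_val`, `none_val` are made up for illustration and stand in for a single 'A_ID' group.

```r
# Hypothetical vectors standing in for the rows of one A_ID group.
value <- c(2L, 1L)
b_id  <- c(1L, 2L)
c_id  <- c(1L, 1L)

# Logical subsetting keeps the matching elements; [1] takes the first one.
eq_val  <- value[b_id == c_id][1]   # first Value where B_ID == C_ID
neq_val <- value[b_id != c_id][1]   # first Value where B_ID != C_ID

# If no element matches, the subset is empty and [1] yields NA,
# so there is still a single value to recycle over the whole group.
none_val <- value[b_id > 10L][1]

print(c(eq_val, neq_val, none_val))
```

If a group could contain several matching rows, `[1]` would silently keep only the first, so it may be worth checking the data before relying on it.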
Data:
df1 <- structure(list(A_ID = c(1L, 1L, 2L, 2L), B_ID = 1:4,
    C_ID = c(1L, 1L, 3L, 3L), Value = c(2L, 1L, 0L, 3L)),
    class = "data.frame", row.names = c(NA, -4L))
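For readers who prefer not to load dplyr, the same idea can probably be expressed in base R with `ave()`. This is a sketch under the same "at most one matching row per group" assumption and is not the method from the answer above; the helper vector `eq` is introduced here only for illustration.

```r
# Same example data as in the answer, rebuilt with data.frame().
df1 <- data.frame(
  A_ID  = c(1L, 1L, 2L, 2L),
  B_ID  = 1:4,
  C_ID  = c(1L, 1L, 3L, 3L),
  Value = c(2L, 1L, 0L, 3L)
)

# Row-wise flag: TRUE where B_ID equals C_ID (illustrative helper).
eq <- df1$B_ID == df1$C_ID

# ave() passes the row indices of each A_ID group to FUN; FUN returns the
# first Value among the group's matching rows (NA if there is none), and
# ave() recycles that single value over every row of the group.
df1$Value_Equal <- ave(seq_len(nrow(df1)), df1$A_ID,
                       FUN = function(i) df1$Value[i][eq[i]][1])
df1$Value_NotEqual <- ave(seq_len(nrow(df1)), df1$A_ID,
                          FUN = function(i) df1$Value[i][!eq[i]][1])

print(df1)
```

Passing row indices rather than `Value` itself lets the function inside `ave()` look at both the values and the equality flag for the same rows.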