
Group data and generate new column

I have data as follows:

1 0 1234
1 0 1235
1 0 5434
2 1 31212
2 1 3212
2 0 1211
3 0 2212
3 0 2212
3 1 1212

What I would like to accomplish in R is to generate a new column that has a value of 1 if at least one of the three values in the second column that belong together (the rows sharing the same value in the first column) is 1. So my new column would look like this:

1 0 1234 0
1 0 1235 0 
1 0 5434 0
2 1 31212 1
2 1 3212 1
2 0 1211 1
3 0 2212 1
3 0 2212 1
3 1 1212 1

As each set of 3 rows belongs together, I was not able to figure out how to accomplish this. Could anyone help me with this?

You can use dplyr: group_by the first column (V1 in my case), then use any to check whether any of the values in the group equals 1.

library(dplyr)
df1 %>% 
   group_by(V1) %>% 
   # any(V2 == 1) is TRUE when at least one value in the group equals 1
   mutate(new = ifelse(any(V2 == 1), 1, 0))

#Source: local data frame [9 x 4]
#Groups: V1 [3]

#     V1    V2    V3   new
#  <int> <int> <int> <dbl>
#1     1     0  1234     0
#2     1     0  1235     0
#3     1     0  5434     0
#4     2     1 31212     1
#5     2     1  3212     1
#6     2     0  1211     1
#7     3     0  2212     1
#8     3     0  2212     1
#9     3     1  1212     1
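To make the grouping logic explicit, here is a minimal base R sketch of the same idea in two steps: compute one flag per group, then look that flag up for every row. It assumes the column names V1, V2 and V3 used in the answers, and group_flag is just an illustrative name.

```r
# Toy data mirroring the question (column names V1, V2, V3 are assumed).
df1 <- data.frame(
  V1 = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L),
  V2 = c(0L, 0L, 0L, 1L, 1L, 0L, 0L, 0L, 1L),
  V3 = c(1234L, 1235L, 5434L, 31212L, 3212L, 1211L, 2212L, 2212L, 1212L)
)

# Step 1: one logical flag per group: does any V2 in the group equal 1?
group_flag <- tapply(df1$V2 == 1, df1$V1, any)

# Step 2: look the flag up by group label for every row and store it as 0/1.
df1$new <- as.integer(group_flag[as.character(df1$V1)])

df1$new
# [1] 0 0 0 1 1 1 1 1 1
```

Indexing by the group label as a character string, rather than by position, should keep the lookup correct even when the group ids are not the consecutive integers 1, 2, 3.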

We can use ave from base R

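# flag every row of a V1 group that contains at least one V2 equal to 1
df1$new <- with(df1, ave(V2, V1, FUN = function(x) as.integer(any(x == 1))))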
df1$new
#[1] 0 0 0 1 1 1 1 1 1

Or using table

tbl <- table(df1[1:2])  # counts of each V2 value per V1 group
as.integer(tbl[as.character(df1$V1), "1"] > 0)  # 1 if the group has any V2 == 1
#[1] 0 0 0 1 1 1 1 1 1

Or using data.table

library(data.table)
setDT(df1)[, new := as.integer(any(V2 == 1)), by = V1]
df1
#   V1 V2    V3 new
#1:  1  0  1234   0
#2:  1  0  1235   0
#3:  1  0  5434   0
#4:  2  1 31212   1
#5:  2  1  3212   1
#6:  2  0  1211   1
#7:  3  0  2212   1
#8:  3  0  2212   1
#9:  3  1  1212   1

Data

df1 <- structure(list(V1 = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L), V2 = c(0L, 
0L, 0L, 1L, 1L, 0L, 0L, 0L, 1L), V3 = c(1234L, 1235L, 5434L, 
31212L, 3212L, 1211L, 2212L, 2212L, 1212L)), .Names = c("V1", 
"V2", "V3"), class = "data.frame", row.names = c(NA, -9L))
