How to collect members of a column based on the value of a specific member of that column in R

In the following data frame, within each ID, I want to keep the members of B1 whose value in B2 is greater than or equal to the B2 value of "b". Then, from this filtered data, I want to count how many times each B1 member occurs.

dataframe:

ID  B1  B2
z1  a   2.5
z1  b   1.7
z1  c   170
z1  c   9
z1  d   3
y2  a   0
y2  b   21
y2  c   15
y2  c   101
y2  d   30
y2  d   3
y2  d   15.5
x3  a   30.8
x3  a   54
x3  a   0
x3  b   30.8
x3  c   30.8
x3  d   7

so the result would be:

ID  B1  B2
z1  a   2.5
z1  c   170
z1  c   9
z1  d   3
y2  c   101
y2  d   30
x3  a   30.8
x3  a   54
x3  c   30.8

and

ID  B1  count
z1  a   1
z1  c   2
z1  d   1
y2  a   0
y2  c   1
y2  d   1
x3  a   2
x3  c   1
x3  d   0

Group by 'ID', then filter to keep the rows where 'B2' is greater than or equal to the 'B2' value of the row where 'B1' is 'b'. A second condition drops the 'b' rows themselves ('B1' not equal to 'b').

library(dplyr)
out1 <- df1 %>%
    group_by(ID) %>% 
    filter(any(B1 == "b") & B2 >= min(B2[B1 == "b"]), B1 != 'b') 

-output

> out1
# A tibble: 9 × 3
# Groups:   ID [3]
  ID    B1       B2
  <chr> <chr> <dbl>
1 z1    a       2.5
2 z1    c     170  
3 z1    c       9  
4 z1    d       3  
5 y2    c     101  
6 y2    d      30  
7 x3    a      30.8
8 x3    a      54  
9 x3    c      30.8
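
For readers without dplyr, the same group-wise filter can be sketched in base R. This is only an illustration, not part of the original answer: the names `b_rows`, `thr_by_id`, `thr`, `keep`, and `out_base` are made up here, and the data frame is retyped from the question.

```r
# Data from the question, rebuilt compactly
df1 <- data.frame(
  ID = rep(c("z1", "y2", "x3"), times = c(5, 7, 6)),
  B1 = c("a", "b", "c", "c", "d",
         "a", "b", "c", "c", "d", "d", "d",
         "a", "a", "a", "b", "c", "d"),
  B2 = c(2.5, 1.7, 170, 9, 3,
         0, 21, 15, 101, 30, 3, 15.5,
         30.8, 54, 0, 30.8, 30.8, 7),
  stringsAsFactors = FALSE
)

# Per-ID threshold: the (smallest) B2 value among the "b" rows of that ID
b_rows    <- df1[df1$B1 == "b", ]
thr_by_id <- tapply(b_rows$B2, b_rows$ID, min)

# Look the threshold up for every row; IDs without a "b" row would get NA
thr <- as.vector(thr_by_id[df1$ID])

# Keep rows at or above the threshold, excluding the "b" rows themselves
keep     <- !is.na(thr) & df1$B2 >= thr & df1$B1 != "b"
out_base <- df1[keep, ]
```

Here `out_base` should hold the same nine rows as `out1`, in the original row order.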

For the second output, add 'B1' as a grouping variable, use summarise to get the number of rows per group, and then fill in the missing combinations with complete

library(tidyr)
out1 %>% 
  group_by(B1, .add = TRUE) %>%
  summarise(count = n(), .groups = "drop_last") %>% 
  complete(B1 = unique(.$B1), fill = list(count = 0)) %>%
  ungroup
# A tibble: 9 × 3
  ID    B1    count
  <chr> <chr> <int>
1 x3    a         2
2 x3    c         1
3 x3    d         0
4 y2    a         0
5 y2    c         1
6 y2    d         1
7 z1    a         1
8 z1    c         2
9 z1    d         1

data

df1 <- structure(list(ID = c("z1", "z1", "z1", "z1", "z1", "y2", "y2", 
"y2", "y2", "y2", "y2", "y2", "x3", "x3", "x3", "x3", "x3", "x3"
), B1 = c("a", "b", "c", "c", "d", "a", "b", "c", "c", "d", "d", 
"d", "a", "a", "a", "b", "c", "d"), B2 = c(2.5, 1.7, 170, 9, 
3, 0, 21, 15, 101, 30, 3, 15.5, 30.8, 54, 0, 30.8, 30.8, 7)), 
class = "data.frame", row.names = c(NA, 
-18L))

Using tidyverse (combinations with a zero count are not listed):

library(tidyverse)

df1 %>% 
  group_by(ID) %>% 
  filter(B2 >= B2[B1 == "b"], B1 != "b") %>%
  group_by(ID, B1) %>%
  count(name = "count") %>%
  as.data.frame()
#>   ID B1 count
#> 1 x3  a     2
#> 2 x3  c     1
#> 3 y2  c     1
#> 4 y2  d     1
#> 5 z1  a     1
#> 6 z1  c     2
#> 7 z1  d     1

Created on 2022-04-26 by the reprex package (v2.0.1)
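
The first answer wraps the threshold in any() and min(), while a bare B2[B1 == "b"] assumes exactly one "b" row per group. The base-R sketch below shows why that assumption matters. The vectors are made up for illustration and are not taken from the question's data.

```r
# Hypothetical group with no "b" row
b1 <- c("a", "c", "d")
b2 <- c(2.5, 9, 3)

# Comparing against a zero-length threshold yields a zero-length mask
bare <- b2 >= b2[b1 == "b"]
length(bare)  # 0

# Guarded version: any() is FALSE, so every row is dropped cleanly
# (min() of an empty vector is Inf with a warning, silenced here)
guarded <- any(b1 == "b") & b2 >= suppressWarnings(min(b2[b1 == "b"]))
guarded  # FALSE FALSE FALSE

# Hypothetical group with two "b" rows
d1 <- c("a", "b", "b", "c")
d2 <- c(5, 1, 10, 5)

# The two thresholds (1 and 10) are silently recycled across the rows
recycled <- d2 >= d2[d1 == "b"]
recycled  # TRUE FALSE TRUE FALSE

# min() reduces them to a single threshold (1)
with_min <- d2 >= min(d2[d1 == "b"])
with_min  # TRUE TRUE TRUE TRUE
```

With the question's data each ID has exactly one "b" row, so both forms of the comparison behave the same there.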
