I have a data frame like the one below:
ID | Parameter | value |
---|---|---|
123-01 | a1 | x |
123-02 | a1 | x |
123-01 | b3 | x |
123-02 | b3 | x |
124-01 | a1 | x |
125-01 | a1 | x |
126-01 | a1 | x |
124-01 | b3 | x |
125-01 | b3 | x |
126-01 | b3 | x |
I would like to find the sample IDs that end with "-02" and, for each parameter, calculate the difference between the samples whose IDs share the same first three digits.
For example, calculate the difference between 123-01 and 123-02 for parameter a1, then the difference between 123-01 and 123-02 for parameter b3, and so on.
In the end, I would like to get a table containing:
ID | Parameter | DiffValue |
---|---|---|
123 | a1 | y |
123 | b3 | y |
127 | a1 | y |
127 | b3 | y |
How can I do it?
I tried using dplyr's filter() to create a table that contains only the duplicates, but how do I then match it against the original table and do the calculation?
Try doing it this way:
library(tidyverse)
df <- read.table(text = "ID Parameter value
123-01 a1 10
123-02 a1 10
123-01 b3 10
123-02 b3 10
124-01 a1 10
125-01 a1 10
126-01 a1 10
124-01 b3 10
125-01 b3 10
126-01 b3 10", header = TRUE)
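df %>%
  # sort so that -01 comes before -02 within each parameter
  arrange(Parameter, ID) %>%
  # split "123-01" into the group "123" and the suffix "01", keeping the original ID
  separate(ID, into = c("id_grp", "n"), sep = "-", remove = FALSE) %>%
  group_by(Parameter, id_grp) %>%
  # within each group: NA for the first row, then value minus the previous value
  mutate(diff_value = c(NA, diff(value))) %>%
  # id_grp is a grouping variable, so select() keeps it (see the message below)
  select(-c(id_grp, n))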
#> Adding missing grouping variables: `id_grp`
#> # A tibble: 10 x 5
#> # Groups: Parameter, id_grp [8]
#> id_grp ID Parameter value diff_value
#> <chr> <chr> <chr> <int> <int>
#> 1 123 123-01 a1 10 NA
#> 2 123 123-02 a1 10 0
#> 3 124 124-01 a1 10 NA
#> 4 125 125-01 a1 10 NA
#> 5 126 126-01 a1 10 NA
#> 6 123 123-01 b3 10 NA
#> 7 123 123-02 b3 10 0
#> 8 124 124-01 b3 10 NA
#> 9 125 125-01 b3 10 NA
#> 10 126 126-01 b3 10 NA
Created on 2021-01-26 by the reprex package (v0.3.0)
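If the goal is the compact table from the question (one row per three-digit ID and parameter, only for IDs that have a "-02" sample), a variation on the same idea is sketched below. This is only a sketch under some assumptions: the values are made up for illustration, the column names `rep_01` and `rep_02` are introduced here, each ID has at most one "-01" and one "-02" row per parameter, and the difference is taken as "-02" minus "-01" (swap the operands if the opposite sign is wanted).

```r
library(dplyr)
library(tidyr)

# Illustrative data only: these values are invented, not from the question
df <- read.table(text = "ID Parameter value
123-01 a1 10
123-02 a1 12
123-01 b3 7
123-02 b3 4
124-01 a1 10
124-01 b3 10", header = TRUE, stringsAsFactors = FALSE)

result <- df %>%
  # split "123-01" into the group "123" and the replicate suffix "01"
  separate(ID, into = c("id_grp", "rep"), sep = "-") %>%
  # one row per (id_grp, Parameter), with columns rep_01 and rep_02
  pivot_wider(names_from = rep, values_from = value, names_prefix = "rep_") %>%
  # keep only the groups that actually have a -02 sample
  filter(!is.na(rep_02)) %>%
  # assumed direction of the difference: -02 minus -01
  transmute(ID = id_grp, Parameter, DiffValue = rep_02 - rep_01)

print(result)
```

With the made-up values above, only ID 123 survives the filter, giving a DiffValue of 2 for a1 and -3 for b3. Reshaping to wide format avoids relying on row order, which the `diff()` approach depends on.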