
Tidyverse function

I am new(ish) to R and need your help. I have a dataset with 5 levels of a treatment for a response variable. Assume I measured soil N content at 5 levels of soil water content (optimal, 40%, 30%, 20%, and 10%), with 5 replicates per level. Now I would like to calculate, for each replicate, the unstandardized differences (optimal - 40%, optimal - 30%, optimal - 20%, optimal - 10%) and the standardized differences ((optimal - 40%) / optimal, (optimal - 30%) / optimal, and so on). Is there a way to do this in R with the tidyverse? I am still very new to writing loops and functions, so it would be a great help if someone could answer with some example code.

(As noted in my comment above, it will be easier to answer your questions on this forum if you can share sample data, current code, and your expectations. Then potential answerers can have greater confidence that they're actually answering your question, vs. a version of what your question sounds like.)

Here's an approach using dplyr. First we calculate the mean for each treated/levels combination using group_by + summarize. Note that there are two dimensions of grouping (treated + levels), and summarize "peels off" the last one to be applied (in this case levels). So after the summarize line, the data is still grouped by treated. We can then use the bracket [] notation to specify the level to use for standardization. In this case, I am dividing each value by the "optimal" value within its respective treated group.

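library(dplyr)
df %>%
  # one group per treated/levels combination
  group_by(treated, levels) %>%
  # mean per combination; the result is still grouped by treated
  summarize(avg_raw = mean(values)) %>%
  # divide by the "optimal" mean within the same treated group
  mutate(avg_standarized = avg_raw / avg_raw[levels == "optimal"]) %>%
  ungroup()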

output

# A tibble: 10 x 4
   treated levels  avg_raw avg_standarized
   <lgl>   <chr>     <dbl>           <dbl>
 1 FALSE   10%       0.628           1.16 
 2 FALSE   20%       0.502           0.927
 3 FALSE   30%       0.370           0.684
 4 FALSE   40%       0.606           1.12 
 5 FALSE   optimal   0.541           1    
 6 TRUE    10%       0.608           1.55 
 7 TRUE    20%       0.371           0.945
 8 TRUE    30%       0.499           1.27 
 9 TRUE    40%       0.629           1.60 
10 TRUE    optimal   0.393           1   
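
To see the "peeling off" behaviour described above, here is a minimal sketch on a small made-up data frame. The name demo and its values are invented for illustration, and the explicit .groups = "drop_last" argument assumes dplyr 1.0.0 or later (it only spells out what summarize does by default here).

```r
library(dplyr)

# Made-up data with the same column names as the sample data in this answer.
demo <- data.frame(
  treated = rep(c(TRUE, FALSE), each = 4),
  levels  = rep(c("optimal", "40%"), times = 4),
  values  = c(2, 1, 4, 3, 10, 5, 30, 15),
  stringsAsFactors = FALSE
)

means <- demo %>%
  group_by(treated, levels) %>%
  # "drop_last" removes only the last grouping variable (levels)
  summarize(avg_raw = mean(values), .groups = "drop_last")

# Shows which grouping variables are left after summarize
group_vars(means)

ratios <- means %>%
  # avg_raw[levels == "optimal"] is looked up within each treated group
  mutate(avg_standardized = avg_raw / avg_raw[levels == "optimal"]) %>%
  ungroup()
```

Because means is still grouped by treated, the bracket lookup inside mutate finds exactly one "optimal" row per group, so every row is divided by the optimal mean of its own group.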

Sample data

df <- data.frame(stringsAsFactors = FALSE,
                 levels = rep(c("optimal", "40%", "30%", "20%", "10%"), 4),
                 treated = rep(c(TRUE, FALSE), each = 10),
                 values = (sin(1:20)^2))
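
The question also asks for differences from optimal within each replicate, which the code above does not cover because the sample data has no replicate identifier. The following is only a sketch under stated assumptions: the data frame soil, its columns replicate, level and soil_n, and the values themselves are all hypothetical, and it assumes each replicate has exactly one "optimal" row.

```r
library(dplyr)

# Hypothetical data: 5 replicates x 5 water levels, one row per combination.
# Column names and values are invented for illustration.
soil <- expand.grid(
  replicate = 1:5,
  level = c("optimal", "40%", "30%", "20%", "10%"),
  stringsAsFactors = FALSE
)
soil$soil_n <- 1 + sin(seq_len(nrow(soil)))^2

result <- soil %>%
  group_by(replicate) %>%
  mutate(
    # the single "optimal" value of this replicate, recycled to every row
    optimal_n = soil_n[level == "optimal"],
    diff_raw  = optimal_n - soil_n,      # unstandardized: optimal - level
    diff_std  = diff_raw / optimal_n     # standardized: (optimal - level) / optimal
  ) %>%
  ungroup() %>%
  # the optimal rows are trivially zero, so drop them
  filter(level != "optimal")
```

If the real data holds the levels in separate columns (wide format) rather than one row per measurement, it would first need reshaping, for example with tidyr::pivot_longer, before this grouping approach applies.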
