
How to divide combinations of rows using dplyr or another method in R?

site <- rep(1:4, each = 8, len = 32)
rep <- rep(1:8, times = 4, len = 32)
treatment <- rep(c("A.low","A.low","A.high","A.high","A.mix","A.mix","B.mix","B.mix"), 4)
sp.1 <- sample(0:3,size=32,replace=TRUE)
sp.2 <- sample(0:2,size=32,replace=TRUE)
df.dummy <- data.frame(site, rep, treatment, sp.1, sp.2)

The final data frame has 32 rows (one per site and replicate) with the columns site, rep, treatment, sp.1 and sp.2.

For each site, I want to compute ratios between treatment groups. Two examples: "A.low / A.high" = "sp.1 / sp.1" and "A.low / A.mix" = "sp.1 / sp.2". Each treatment has two replicates per site, and I want every pairing of those replicates in my final columns. My final product would resemble something like:

site  rep  treatment     value
1     1/3  A.low/A.high  Inf
1     1/4  A.low/A.high  1

I started to use dplyr, but I am not sure how to proceed, especially with all the combinations:

df.dummy %>%
  group_by(site) %>%
  summarise(value.1 = sp.1[treatment == "A.low"] / sp.1[treatment == "A.high"])

You could use reshape2 to get the data in a format that is easier to work with.

The code below separates the sp.1 and sp.2 data. acast reshapes each one into a matrix with a single row per site and one column per replicate/treatment combination, with the cells holding the sp.1 or sp.2 values.

Give the columns unique names and combine the two results with cbind.

Now any pair of columns can be divided according to your requirements.

library(dplyr)
library(reshape2)

##your setup
site <- rep(1:4, each = 8, len = 32)
rep <- rep(1:8, times = 4, len = 32)
treatment <- rep(c("A.low","A.low","A.high","A.high","A.mix","A.mix","B.mix","B.mix"), 4)
sp.1 <- sample(0:3,size=32,replace=TRUE)
sp.2 <- sample(0:2,size=32,replace=TRUE)
df.dummy <- data.frame(site, rep, treatment, sp.1, sp.2)

##create unique ids and create a dataframe containing 1 value column
sp1 <- df.dummy %>% mutate(id = paste(rep, treatment, sep = "_")) %>% select(id, site, rep, treatment, sp.1)
sp2 <- df.dummy %>% mutate(id = paste(rep, treatment, sep = "_")) %>% select(id, site, rep, treatment, sp.2)

##reshape the data so that each treatment and replicate is assigned a single column
##each row will be a single site
##each column will contain the values from sp.1 or sp.2
##value.var is given explicitly so acast does not have to guess the value column
sp1 <- reshape2::acast(data = sp1, formula = site ~ id, value.var = "sp.1")
sp2 <- reshape2::acast(data = sp2, formula = site ~ id, value.var = "sp.2")

##rename columns something sensible and unique
colnames(sp1) <- c("low.1.sp1", "low.2.sp1", "high.3.sp1", "high.4.sp1",
                   "mix.5.sp1", "mix.6.sp1", "mix.7.sp1", "mix.8.sp1")
colnames(sp2) <- c("low.1.sp2", "low.2.sp2", "high.3.sp2", "high.4.sp2",
                   "mix.5.sp2", "mix.6.sp2", "mix.7.sp2", "mix.8.sp2")

##combine datasets; acast returns matrices, so convert to a data frame for mutate()
dat <- as.data.frame(cbind(sp1, sp2))

##choose which columns to compare. Some examples shown below
dat <-  dat %>% mutate(low.1.sp1/high.3.sp1, low.1.sp1/high.4.sp1,
                       low.2.sp1/high.3.sp2)
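
If you would rather get the long layout sketched in the question (site, rep, treatment, value) without typing each column pair by hand, a self-join on site is one possible approach. The following is only a sketch in base R: pair_ratio is a hypothetical helper name, not a function from any package, and set.seed(1) is an assumption added to make the example reproducible. Note that division by zero gives Inf and 0/0 gives NaN.

```r
set.seed(1)  # assumption: fixed seed, only so the example is reproducible

# same layout as the data in the question
df.dummy <- data.frame(
  site = rep(1:4, each = 8, len = 32),
  rep = rep(1:8, times = 4, len = 32),
  treatment = rep(c("A.low", "A.low", "A.high", "A.high",
                    "A.mix", "A.mix", "B.mix", "B.mix"), 4),
  sp.1 = sample(0:3, size = 32, replace = TRUE),
  sp.2 = sample(0:2, size = 32, replace = TRUE)
)

# hypothetical helper: divide num_col in treatment num_trt by den_col in
# treatment den_trt, for every pairing of replicates within a site
pair_ratio <- function(df, num_trt, den_trt, num_col, den_col) {
  num <- df[df$treatment == num_trt, c("site", "rep", "treatment", num_col)]
  den <- df[df$treatment == den_trt, c("site", "rep", "treatment", den_col)]
  names(num) <- c("site", "rep.num", "trt.num", "num")
  names(den) <- c("site", "rep.den", "trt.den", "den")
  # merging on site alone pairs every numerator row with every denominator row
  out <- merge(num, den, by = "site")
  data.frame(
    site = out$site,
    rep = paste(out$rep.num, out$rep.den, sep = "/"),
    treatment = paste(out$trt.num, out$trt.den, sep = "/"),
    value = out$num / out$den
  )
}

# the two comparisons named in the question
res <- rbind(
  pair_ratio(df.dummy, "A.low", "A.high", "sp.1", "sp.1"),
  pair_ratio(df.dummy, "A.low", "A.mix", "sp.1", "sp.2")
)
head(res)
```

Each call returns 2 x 2 = 4 rows per site (replicates 1/3, 1/4, 2/3, 2/4 for A.low/A.high), so further comparisons can be added by appending more pair_ratio calls to the rbind.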

