简体   繁体   中英

Combining multiple rows into one row with multiple columns of data R

I have a dataset where the same effect was measured in different ways and I want to compare those measurements. My dataset looks like this:

Study     MType     ID     Insect     Mean     Sd     N
Alla      Fecundity  1      Aphid      .62      .7628  11
Alla      RGR        1      Aphid      -32.8    7.76   11
Ando      Survival   2      Bee        2.34     .67    8
Ando      RGR        2      Bee        4.56     .34    10
Ando      Fecundity  2      Bee        5.32     4.3    20

I want to combine the rows by ID number so that the MType, Mean, Sd and N for each row are preserved (although the column names need to change so the columns are distinguishable).

Hopefully, in the end it would look like:

Study ID Insect Fecundity.mean Fecundity.Sd Fecundity.N RGR.mean RGR.Sd...etc

Some difficulties:

  1. There are about 10 different MTypes
  2. each ID number has between 2 and 4 MTypes

I have messed around with reshape and with tidyr and I haven't been able to figure out how to do this with either of them. Please help!

You can use reshape via base R. You want to transform your data from long to wide format according to this post: How to reshape data from long to wide format? .

If your data is in a data.frame d:

reshape(d, idvar=c("ID", "Study", "Insect"), timevar = "MType", direction="wide")

Results:

  Study ID Insect Mean.Fecundity Sd.Fecundity N.Fecundity Mean.RGR Sd.RGR N.RGR Mean.Survival Sd.Survival N.Survival
1  Alla  1  Aphid           0.62       0.7628          11   -32.80   7.76    11            NA          NA         NA
3  Ando  2    Bee           5.32       4.3000          20     4.56   0.34    10          2.34        0.67          8

Doing this with tidyr is not obvious, because you have to first gather() and then spread() .

library(tidyverse)
example <- tribble(
~Study, ~MType, ~ID, ~Insect, ~Mean,   ~Sd,   ~N,
"Alla", "Fecundity",  1, "Aphid", .62, .7628,  11,
"Alla", "RGR",   1, "Aphid", -32.8,  7.76, 11,
"Ando", "Survival", 2, "Bee",   2.34,   .67,  8,
"Ando", "RGR",   2, "Bee",   4.56,   .34,  10,
"Ando", "Fecundity",  2, "Bee",   5.32,   4.3,  20)

gather(example, key = "Statistic", value = "value", Mean, Sd, N) %>%
  unite(col="MType.Statistic", MType, Statistic, sep = ".") %>% 
  spread(key = MType.Statistic, value=value)
#> # A tibble: 2 x 12
#>   Study    ID Insect Fecundity.Mean Fecundity.N Fecundity.Sd RGR.Mean
#> * <chr> <dbl> <chr>           <dbl>       <dbl>        <dbl>    <dbl>
#> 1 Alla   1.00 Aphid           0.620        11.0        0.763   -32.8 
#> 2 Ando   2.00 Bee             5.32         20.0        4.30      4.56
#> # ... with 5 more variables: RGR.N <dbl>, RGR.Sd <dbl>,
#> #   Survival.Mean <dbl>, Survival.N <dbl>, Survival.Sd <dbl>

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM