
How to extract certain rows for each variable in R?

Here is my data.frame:

    Data=structure(list(Year = structure(c(-25567, -25202, -24837, -24472, 
    -24107, -23741, -23376, -23011, -22646, -22280, 10592, 10957, 11323, 11688, 12053, 
    12418, 12784, 13149, 13514, 13879, 14245, 14610, 14975, 15340, 15706, 16071, 
  -25567, -25202, -24837, -24472, -24107, -23741, -23376, -23011, -22646, -22280), 
   class = "Date"),     variable = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
   1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
   1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("TT", 
  "CC"), class = "factor"), Mean = c(1,   7, 4, 3,8, 3, 6, 7, 5, 3, 4, 6, 3, 1, 13, 4, 
  18, 14, 16, 16, 17, 15, 15, 74, 19, 19, 0, 5, 18, 0.5, 3, 7, 0., 0, -1, -2), par = 
  c("h",     "h", "h", "h", "h", "h", "h", "h",     "h", "h", "h", "h", "h", "h", "h",     
  "h", "h", "h", "m", "m", "m", "m",     "m", "m", "m", "m", "h", "h", "h",     "h", 
  "h", "m", "m", "m", "m", "m"    )), row.names = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 
  9L, 10L,100L, 101L, 102L, 103L, 104L, 105L, 106L, 107L, 108L, 109L, 110L, 
  111L, 112L, 113L, 114L, 115L, 116L, 117L, 118L, 119L, 120L, 121L, 
  122L, 123L, 124L, 125L), class = "data.frame")

If I run

    Data[3:5, ]

it extracts rows 3 to 5 of the whole data frame.

However, I need rows 3:5 within each combination of par and variable. For instance, the expected output is:

  3: 1902-01-01       TT  4.0   h
  4: 1903-01-01       TT  3.0   h
  5: 1904-01-01       TT  8.0   h
  21: 2009-01-01       TT 17.0   m
  22: 2010-01-01       TT 15.0   m
  23: 2011-01-01       TT 15.0   m
  29: 1902-01-01       CC 18.0   h
  30: 1903-01-01       CC  0.5   h
  31: 1904-01-01       CC  3.0   h
  34: 1907-01-01       CC  0.0   m
  35: 1908-01-01       CC -1.0   m
  36: 1909-01-01       CC -2.0   m


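Using data.table:

library(data.table)
# setDT() converts Data to a data.table by reference;
# .SD is the subset of rows belonging to each (par, variable) group
setDT(Data)[, .SD[3:5], by = .(par, variable)]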
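
The per-group subsetting above is a split-apply-combine operation: split the rows by group, take positions 3 to 5 inside each piece, and stack the pieces again. A minimal base R sketch of that idea follows, for illustration only. The data frame `toy` and the names `pieces` and `picked` are made up here and are not part of the question's data.

```r
# Hypothetical toy data with two groups of five rows each
toy <- data.frame(
  par      = c("h", "h", "h", "h", "h", "m", "m", "m", "m", "m"),
  variable = "TT",
  Mean     = c(1, 7, 4, 3, 8, 17, 15, 15, 74, 19),
  stringsAsFactors = FALSE
)

# Split into one data frame per (par, variable) combination
pieces <- split(toy, list(toy$par, toy$variable), drop = TRUE)

# Keep positions 3:5 of each piece; intersect() guards against
# groups that have fewer than five rows
picked <- do.call(
  rbind,
  lapply(pieces, function(d) d[intersect(3:5, seq_len(nrow(d))), ])
)

print(picked)
```

The `intersect()` call is a design choice for this sketch: it silently returns fewer rows for short groups rather than padding with missing values.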

You can group_by par and variable and then use slice:

library(dplyr)

Data %>% group_by(par, variable) %>% slice(3:5) %>% ungroup()

#    Year       variable  Mean par  
#   <date>     <fct>    <dbl> <chr>
# 1 1902-01-01 TT         4   h    
# 2 1903-01-01 TT         3   h    
# 3 1904-01-01 TT         8   h    
# 4 1902-01-01 CC        18   h    
# 5 1903-01-01 CC         0.5 h    
# 6 1904-01-01 CC         3   h    
# 7 2009-01-01 TT        17   m    
# 8 2010-01-01 TT        15   m    
# 9 2011-01-01 TT        15   m    
#10 1907-01-01 CC         0   m    
#11 1908-01-01 CC        -1   m    
#12 1909-01-01 CC        -2   m    

In base R:

subset(Data, ave(Mean, par, variable, FUN = seq_along) %in% 3:5)
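
The base R one-liner works because `ave()` with `FUN = seq_along` returns, for every row, its position inside its own group; rows whose position falls in 3:5 are then kept. A small sketch on made-up data (the data frame `toy` and its columns `g` and `val` are hypothetical) may make the intermediate counter easier to see:

```r
# Hypothetical toy data: group "a" has five rows, group "b" has four
toy <- data.frame(
  g   = c("a", "a", "a", "a", "a", "b", "b", "b", "b"),
  val = c(1, 2, 3, 4, 5, 10, 20, 30, 40)
)

# Position of each row within its group: 1..5 for "a", 1..4 for "b"
pos <- ave(seq_len(nrow(toy)), toy$g, FUN = seq_along)

# Keep only rows whose within-group position is 3, 4 or 5
picked <- toy[pos %in% 3:5, ]

print(pos)
print(picked)
```

Because the filter is a logical test on positions, a group with fewer than five rows (like "b" here) simply contributes the rows it has.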

Using dplyr, filtering on the row number within each group:

Data %>%
    group_by(par, variable) %>%
    filter(row_number() %in% 3:5)

