
How to select (four) specific rows (multiple times) based on a column value in R?

I only want to select the IDs that appear in my dataframe for all years from 2013 until 2016 (so four times). In that case only IDs with exactly four rows are left (panel data; each ID has one row per year). I have already made sure my dataframe only covers the years I need (2013, 2014, 2015, and 2016), but I want to exclude the IDs that have fewer than 4 years/rows in my dataframe.

This is the structure of my dataframe:

 tibble [909,587 x 26] (S3: tbl_df/tbl/data.frame)
     $ ID                         : num [1:909587] 12 12 12 12 16 16 16 16...
     $ Gender                     : num [1:909587] 2 2 2 2 1 1 1 1 1 1 ...
      ..- attr(*, "format.spss")= chr "F10.0"
     $ Year                       : chr [1:909587] "2016" "2013" "2014" "2015" ...
      ..- attr(*, "format.spss")= chr "F9.3"
     $ Size                       : num [1:909587] 1983 1999 1951 1976 902 ...
     $ Costs                      : num [1:909587] 2957.47 0 0.34 1041.67 0 ...
     $ Urbanisation               : num [1:909587] 2 3 3 2 3 3 2 2 2 3 ...
     $ Age                        : num [1:909587] 92 89 90 91 82 83 22 23 24 65 ...

How can I achieve that?

Thank you!

Pivot your df to wide format, with one column per year:

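# Year-varying columns other than Age (e.g. Size, Costs) must be dropped first, or rows will not collapse to one per ID
df %>% pivot_wider(names_from = Year, values_from = Age)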

Filter out the rows that have NA in any of the columns 2013, 2014, 2015, or 2016.

Pivot back to long format:

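# Backticks are required: a bare 2013:2016 would be read as column positions
df %>% pivot_longer(c(`2013`, `2014`, `2015`, `2016`), names_to = "Year", values_to = "Age")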
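
The three steps above can be sketched end to end on made-up data. The `toy` tibble, its values, and the `years` vector are assumptions for illustration, not the asker's data, and only `ID`, `Year`, and `Age` are kept so that the wide table has one row per ID. Selecting the year columns by name with `all_of()` avoids relying on their order, which `pivot_wider()` takes from first appearance in the data.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy panel: ID 3 has no row for 2015
toy <- tibble(
  ID   = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3),
  Year = c("2013", "2014", "2015", "2016",
           "2013", "2014", "2015", "2016",
           "2013", "2014", "2016"),
  Age  = c(40, 41, 42, 43, 60, 61, 62, 63, 25, 26, 28)
)

years <- as.character(2013:2016)  # assumed set of required years

complete_ids <- toy %>%
  # Step 1: one row per ID, one column per year (missing years become NA)
  pivot_wider(names_from = Year, values_from = Age) %>%
  # Step 2: drop IDs with NA in any of the year columns
  drop_na(all_of(years)) %>%
  # Step 3: back to one row per ID and year
  pivot_longer(all_of(years), names_to = "Year", values_to = "Age")
```

Here `complete_ids` keeps only IDs 1 and 2. On the full dataframe from the question, the other year-varying columns would need to be handled before pivoting, which is one reason a grouped filter may be simpler.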

Just to capture @Jasonaizkains' answer from the comments above, since pivoting is not strictly necessary in this case. Here it is with some play data:

library(dplyr)
id <- rep(10:13, 4) # four subjects
year <- rep(2013:2016, each = 4) # four years
gender <- sample(1:2, 16, replace = TRUE)
play <- tibble(id, gender, year) # tibble with 16 rows

play <- play[-9,] # removes row for id 10 in 2015

# Keeps only the ids observed in all four years, so every row for id 10 is dropped
play %>% group_by(id) %>% filter(n_distinct(year) >= 4) %>% ungroup()
#> # A tibble: 12 x 3
#>       id gender  year
#>    <int>  <int> <int>
#>  1    11      1  2013
#>  2    12      2  2013
#>  3    13      2  2013
#>  4    11      1  2014
#>  5    12      2  2014
#>  6    13      1  2014
#>  7    11      2  2015
#>  8    12      2  2015
#>  9    13      2  2015
#> 10    11      2  2016
#> 11    12      2  2016
#> 12    13      1  2016
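
The same grouped filter can be sketched against the column names from the question (`ID`, and `Year` stored as character). The `panel` tibble, its values, and the `required_years` vector are assumptions for illustration. This variant checks membership with `%in%` instead of counting distinct years, so it would still work if the dataframe also contained years outside 2013 to 2016.

```r
library(dplyr)

# Hypothetical toy panel using the question's column names; Year is character
panel <- tibble(
  ID    = c(12, 12, 12, 12, 16, 16, 16),
  Year  = c("2016", "2013", "2014", "2015", "2013", "2014", "2014"),
  Costs = c(2957.47, 0, 0.34, 1041.67, 0, 10, 20)
)

required_years <- as.character(2013:2016)  # assumed set of required years

balanced <- panel %>%
  group_by(ID) %>%
  # Keep an ID only if every required year occurs among its rows
  filter(all(required_years %in% Year)) %>%
  ungroup()
```

In this sketch ID 16 is dropped because it has no rows for 2015 or 2016, even though it has three rows.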

