
R: Subsetting a data frame by decade using the values of a column

I have a year column running from 1921 to 2020. I want to run an analysis by decade, so I want to subset the data frame into decades. I tried a couple of approaches, but they keep giving errors.

decade1=data_all%>%filter(data_all$year%>%1920:1929)

Error: Problem with filter() input ..1.
x 3 arguments passed to ':' which requires 2
ℹ Input ..1 is data_all$year %>% 1920:1929.
Run rlang::last_error() to see where the error occurred.

decade1=data_all%>%filter(data_all$year==1920:1929)

Warning message: In data_all$year == 1920:1929: longer object length is not a multiple of shorter object length

What code should I be using?

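Use %in% inside filter(). The pipe %>% is not a membership test, and == compares element by element, recycling 1920:1929 along the column, which is what triggers the warning. Inside filter() the column can also be referred to by its bare name: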

library(dplyr)
data_all%>% 
      filter(year %in% 1920:1929) 
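
To see why == misbehaves here while %in% works, the following is a minimal base R sketch. The years vector is invented for illustration and is not the asker's data.

```r
# Hypothetical year values, invented for illustration
years <- 1921:1932

# `==` compares element by element and recycles 1920:1929 along `years`,
# so each year is checked against only one value of the range.
# The lengths (12 and 10) do not divide evenly, hence the warning,
# which is suppressed here only to keep the output clean.
eq_result <- suppressWarnings(years == 1920:1929)

# `%in%` asks, for each year, whether it appears anywhere in 1920:1929
in_result <- years %in% 1920:1929

print(sum(eq_result))  # how many years `==` flagged
print(sum(in_result))  # how many years `%in%` flagged
```

In this sketch == flags none of the years even though nine of them fall in the 1920s, because every year happens to be lined up against a different value of the range.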

Alternatively, it may help to label each year with its decade first, and then filter or group on that label.

library(dplyr)

df <- data.frame(
    year = sample(1920:2020, 50, replace = TRUE)
)

# cut() intervals are right-closed: (1919, 1929] covers 1920 to 1929, and so on
df %>% 
  mutate(decade = cut(year, breaks = c(1909,1919,1929,1939,1949,1959,1969,1979,1989,1999,2009,2019,2029), 
         labels = c("1910s","1920s","1930s","1940s","1950s","1960s","1970s","1980s","1990s","2000s","2010s","2020s"))) %>%
  arrange(year)
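
Once a decade label exists, the data frame can be broken into one piece per decade in a single step. The following is a sketch in base R, assuming a data frame shaped like the one in the question. The value column and the by_decade name are made up for illustration, and the breaks are written with seq() and right = FALSE rather than typed out by hand.

```r
# Hypothetical data: one row per year from 1921 to 2020
data_all <- data.frame(year = 1921:2020, value = seq_len(100))

# Left-closed intervals: [1920, 1930) is labelled "1920s", and so on
starts <- seq(1920, 2020, by = 10)
data_all$decade <- cut(data_all$year,
                       breaks = seq(1920, 2030, by = 10),
                       labels = paste0(starts, "s"),
                       right = FALSE)

# split() returns a named list with one data frame per decade
by_decade <- split(data_all, data_all$decade)

print(names(by_decade))
decade1 <- by_decade[["1920s"]]
print(range(decade1$year))
```

Each element of by_decade is an ordinary data frame, so by_decade[["1930s"]] can be analysed on its own or the whole list can be passed to lapply().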

My solution to the exercise was quite similar to yours:

Create a new feature, decade, by rounding each year down to a multiple of 10 with integer division:

decades <- seq(1890,2010, by=10)

data$decade <- as.factor(data$Year %/% 10 * 10)

print(data$decade)
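
A small sketch of what the integer-division step does, using an invented year vector (not the data from the exercise):

```r
# Hypothetical years, invented for illustration
year <- c(1921, 1929, 1930, 1999, 2000, 2020)

# %/% is integer division: 1929 %/% 10 is 192, and 192 * 10 is 1920
decade <- year %/% 10 * 10
print(decade)

# Rows of a single decade can then be picked with a plain comparison
in_1920s <- year[decade == 1920]
print(in_1920s)
```

Because decade is a single number per row, == is safe here: it compares every row against one value, so no recycling problem arises.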
