Data analysis in R with the nycflights13 package

I'm trying to find out which destinations have the highest rate of delayed flights. For example, if LAX has 10 flights and 3 of them are delayed, the delay rate for LAX would be 30%. This is what I have so far, but I can't get the formula right.

flights %>% 
  group_by(dest) %>% 
  summarise(delay_rate = n_distinct(flight) / n_distinct(dep_delay)) %>% 
  arrange(desc(delay_rate)) %>% 
  view()
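Count the delayed flights by summing the logical condition, then divide by the group size:

flights %>%
  group_by(dest) %>%
  summarise(delay_rate = sum(dep_delay > 0, na.rm = TRUE) / n() * 100) %>%
  arrange(desc(delay_rate)) %>%
  View()

where sum(dep_delay > 0, na.rm = TRUE) is the number of flights with a positive departure delay and n() is the total number of flights. n_distinct(dep_delay > 0) does not work for this: it counts the distinct values of the logical vector (TRUE, FALSE, NA), so it can never exceed 3. na.rm = TRUE is needed because dep_delay is NA for flights that never departed; those flights still count towards n().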
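
To check the formula against the LAX example from the question, here is a minimal sketch on made-up data. The toy_flights data frame and its values are hypothetical and are not taken from nycflights13. It puts the count of delayed flights next to what n_distinct() returns on the same condition:

```r
library(dplyr)

# Hypothetical toy data mirroring the question's example:
# LAX has 10 flights with 3 delayed; ORD has 4 flights with 1 delayed and 1 NA.
toy_flights <- data.frame(
  dest = c(rep("LAX", 10), rep("ORD", 4)),
  dep_delay = c(5, 12, 30, -2, 0, -1, -4, 0, -3, -6,
                15, -5, 0, NA)
)

delay_rates <- toy_flights %>%
  group_by(dest) %>%
  summarise(
    flights    = n(),                               # all flights in the group
    delayed    = sum(dep_delay > 0, na.rm = TRUE),  # flights with a positive delay
    distinct   = n_distinct(dep_delay > 0),         # distinct logical values only
    delay_rate = delayed / flights * 100
  ) %>%
  arrange(desc(delay_rate))

print(delay_rates)
```

For LAX the delayed column is 3 out of 10 flights, which gives the 30% from the question, while the distinct column is only 2 (TRUE and FALSE). If cancelled flights should be left out of the denominator, mean(dep_delay > 0, na.rm = TRUE) * 100 is one alternative; which denominator is right depends on how "rate" is defined.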
