I have a flight database with route details that looks like this:
Ori.  Dest  Carr.  Pass   Flights
JFK   LAX   Delta  15004  50
JFK   LAX   JetBl  17434  100
JFK   BOS   Delta  15344  89
ATL   FLR   AmerA  25054  90
OHD   LAX   Delta  19876  95
OHD   LAX   AmerA  12344  45
For the output, I only need the routes that are served by exactly one carrier. The output should look like this:
JFK BOS Delta 15344 89
ATL FLR AmerA 25054 90
How can I do this in R?
With dplyr, you can group by route and keep only the groups that contain a single row:
library(dplyr)
df %>% group_by(Ori., Dest) %>% filter(n() == 1)
#   Ori.  Dest  Carr.  Pass Flights
#   <chr> <chr> <chr> <int>   <int>
# 1 JFK   BOS   Delta 15344      89
# 2 ATL   FLR   AmerA 25054      90
Using data.table:
library(data.table)
setDT(df)[, .SD[.N == 1], .(Ori., Dest)]
And in base R, using ave() to count the rows per route:
subset(df, ave(Flights, Ori., Dest, FUN = length) == 1)
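All three approaches above count rows per route, which matches "one carrier" only as long as each carrier appears at most once per route, as it does in the sample data. If the real data could hold several rows for the same carrier on one route, counting distinct carriers may be closer to what the question asks. Below is a minimal base R sketch of that idea. The extra seventh row and the names df2, n_carriers and single_carrier are assumptions made for illustration and are not part of the original data.

```r
# Sample data from the question, plus one hypothetical extra row:
# a second Delta record on JFK -> BOS (not in the original data).
df2 <- data.frame(
  Ori.    = c("JFK", "JFK", "JFK", "ATL", "OHD", "OHD", "JFK"),
  Dest    = c("LAX", "LAX", "BOS", "FLR", "LAX", "LAX", "BOS"),
  Carr.   = c("Delta", "JetBl", "Delta", "AmerA", "Delta", "AmerA", "Delta"),
  Pass    = c(15004L, 17434L, 15344L, 25054L, 19876L, 12344L, 100L),
  Flights = c(50L, 100L, 89L, 90L, 95L, 45L, 1L),
  stringsAsFactors = FALSE
)

# Number of distinct carriers on each row's route.
# ave() is applied to row indices so that the result stays numeric.
n_carriers <- with(df2, ave(seq_along(Carr.), Ori., Dest,
                            FUN = function(i) length(unique(Carr.[i]))))

# Keep every row whose route is served by exactly one carrier.
single_carrier <- df2[n_carriers == 1, ]
single_carrier
```

Here both JFK -> BOS rows are kept, whereas a row-count filter would drop them. In dplyr the same condition can be written as filter(n_distinct(Carr.) == 1) after group_by(Ori., Dest).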
Data:
df <- structure(list(Ori. = c("JFK", "JFK", "JFK", "ATL", "OHD", "OHD"
), Dest = c("LAX", "LAX", "BOS", "FLR", "LAX", "LAX"), Carr. = c("Delta",
"JetBl", "Delta", "AmerA", "Delta", "AmerA"), Pass = c(15004L,
17434L, 15344L, 25054L, 19876L, 12344L), Flights = c(50L, 100L,
89L, 90L, 95L, 45L)), class = "data.frame", row.names = c(NA, -6L))
We can also do this in base R without any grouping operation, by dropping every row whose route (the first two columns) appears more than once:
df[!(duplicated(df[1:2])|duplicated(df[1:2], fromLast = TRUE)),]
#   Ori. Dest Carr.  Pass Flights
# 3  JFK  BOS Delta 15344      89
# 4  ATL  FLR AmerA 25054      90
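To see why two duplicated() calls are needed, it may help to split the one-liner into its parts. This is only an illustrative breakdown of the same expression. The intermediate names routes, dup_fwd, dup_bwd and appears_once are made up for this sketch, and the data frame is rebuilt here so the snippet runs on its own.

```r
# Sample data from the question.
df <- data.frame(
  Ori.    = c("JFK", "JFK", "JFK", "ATL", "OHD", "OHD"),
  Dest    = c("LAX", "LAX", "BOS", "FLR", "LAX", "LAX"),
  Carr.   = c("Delta", "JetBl", "Delta", "AmerA", "Delta", "AmerA"),
  Pass    = c(15004L, 17434L, 15344L, 25054L, 19876L, 12344L),
  Flights = c(50L, 100L, 89L, 90L, 95L, 45L),
  stringsAsFactors = FALSE
)

# Only the two columns that define a route.
routes <- df[c("Ori.", "Dest")]

# TRUE for the second and later occurrences of a route (scanning top-down).
dup_fwd <- duplicated(routes)

# TRUE for every occurrence except the last one (scanning bottom-up).
dup_bwd <- duplicated(routes, fromLast = TRUE)

# A route that appears only once is flagged by neither scan.
appears_once <- !(dup_fwd | dup_bwd)

df[appears_once, ]
```

A single duplicated() call would still keep the first row of each repeated route, so the forward and backward scans are combined to flag all of its rows.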