I have a dataset with satisfaction scores (0-5) from airline passengers regarding multiple categories like cleanliness, seat comfort, gate location, etc. The dataset also includes info about class, type of travel, age, and so on.
I want to find out whether business class travelers are (on average) more satisfied in every category than economy class travelers.
I know that I can just check the mean satisfaction score of each category (A...n), grouped by class (see below).
library(dplyr)
final_dataset %>%
group_by(Class) %>%
summarise_at(vars(Cleanliness), list(mean = mean))
That way I will know what the mean for the different classes is for a given category. I've tried that and it works. This is a lot of effort though and doesn't really look great. There has to be a better way so I can see a list of categories and which class is most satisfied, right?
Class is a factor (find the code below), while the satisfaction scores are doubles.
final_dataset$Class <- as.factor(final_dataset$Class)
I've tried this, but it didn't work (and I don't even know exactly what it does):
library( data.table )
setDT( final_dataset )
final_dataset[ , .( mean.change = mean( "Cleanliness" ) ),
by = Class
][ , Class[ which.max( mean.change ) ] ]
The error message reads:
Error in `[.data.table`(final_dataset, , .(mean.change = mean("Cleanliness")), : fastmean was passed type character, not numeric or logical
I read something about providing sample data in other posts while looking for solutions, but I have no clue if this is how to do it. I tried to insert a little bit as a sample. Just for reference: this is where I got the dataset.
ID  Class         Check-in Service  Online Boarding  Gate Location  Cleanliness
    <chr>         <dbl>             <dbl>
1   Business      3                 3                4              3
2   Economy Plus  2                 2                3              5
3   Economy       2                 2                3              2
4   Business      4                 4                4              5
5   Economy       1                 1                3              2
I hope that is all you need to understand my question, I'm fairly new to this.
Thanks in advance for your help!
I'm not exactly sure what you want, but here is my attempt with the data.table package. The tidyverse is essential for R, by the way. I don't understand what you meant by "doesn't really look great" :)
library(tibble)  # tibble() comes from the tibble package
df <- tibble(Class = c("Business", "Economy Plus", "Economy", "Business"), service1 = c(1, 2, 3, 4), service2 = c(1, 2, 3, 4), service3 = c(1, 2, 3, 4), service4 = c(1, 2, 3, 4))
df$Class <- as.factor(df$Class)
dummy data:
# A tibble: 4 x 5
  Class        service1 service2 service3 service4
  <chr>           <dbl>    <dbl>    <dbl>    <dbl>
1 Business            1        1        1        1
2 Economy Plus        2        2        2        2
3 Economy             3        3        3        3
4 Business            4        4        4        4
--
library(data.table)
df <- as.data.table(df)
# Overall mean across all four service columns, computed per class
df <- df[, .(satisfaction = mean(c(service1, service2, service3, service4))), by = Class]
output:
          Class satisfaction
1:     Business          2.5
2: Economy Plus          2.0
3:      Economy          3.0
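If what you are after is one mean per category (rather than one overall number per class), something like the following might be closer. It is only a sketch under a few assumptions: the toy table `scores` reuses the five sample rows from the question, and the column names (`CheckinService`, `OnlineBoarding`, `GateLocation`) are renamed without spaces purely for convenience. As a side note, the error in the question most likely comes from `mean("Cleanliness")`, which passes the literal string to `mean()`; inside `[.data.table` the column is referenced unquoted, as in `mean(Cleanliness)`.

```r
library(data.table)

# Toy data copied from the sample rows in the question.
# Column names are shortened here (an assumption for illustration).
scores <- data.table(
  Class          = factor(c("Business", "Economy Plus", "Economy", "Business", "Economy")),
  CheckinService = c(3, 2, 2, 4, 1),
  OnlineBoarding = c(3, 2, 2, 4, 1),
  GateLocation   = c(4, 3, 3, 4, 3),
  Cleanliness    = c(3, 5, 2, 5, 2)
)

score_cols <- c("CheckinService", "OnlineBoarding", "GateLocation", "Cleanliness")

# Mean of every score column, one row per class
class_means <- scores[, lapply(.SD, mean), by = Class, .SDcols = score_cols]

# Reshape to long format: one row per (Class, category) pair
long_means <- melt(class_means, id.vars = "Class",
                   variable.name = "category", value.name = "mean_score")

# For each category, keep the class with the highest mean score
top_class <- long_means[, .SD[which.max(mean_score)], by = category]

print(class_means)
print(top_class)
```

With these five rows, Business comes out on top for three of the four categories and Economy Plus for Cleanliness, so `top_class` gives the "list of categories and which class is most satisfied" in one table. Note that `which.max()` returns only the first maximum, so ties would need separate handling.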
Hope this helps you.