I made a data frame of people's names, ages, and favorite movies. I want to write a program that operates on the data frame and gives me the average age of the people who share each favorite movie. Here's what I have.
persons <- list(firstName = c("Steve","Bob","Bill","Chris","Matt","Evan"), lastName = c("Williams","Barker","Barker","Williams","Stevenson","Parker"), age = c(22,30,41,14,9,93), favoriteMovie = c("Alien","The Shining","The Shining","Halloween","Alien","Alien"))
d1 <- data.frame(persons$firstName,persons$lastName,persons$age,persons$favoriteMovie)
d1
persons.firstName persons.lastName persons.age persons.favoriteMovie
1 Steve Williams 22 Alien
2 Bob Barker 30 The Shining
3 Bill Barker 41 The Shining
4 Chris Williams 14 Halloween
5 Matt Stevenson 9 Alien
6 Evan Parker 93 Alien
I can do it with a loop of if statements, but I don't think that is the most efficient way. I'm sure there's some way to single out the relevant values, but I'm not sure what it is.
You could try using tapply:
with(d1, tapply(persons.age, persons.favoriteMovie, mean))
Alien Halloween The Shining
41.33333 14.00000 35.50000
You might want to take a look at this answer.
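For the narrower case of a single movie, logical indexing is enough to single out the matching ages. This is a minimal sketch, assuming d1 is built exactly as in the question (it is rebuilt here so the block runs on its own); the name alien_mean is only illustrative.

```r
# Rebuild the data frame from the question so this block is self-contained
persons <- list(
  firstName = c("Steve", "Bob", "Bill", "Chris", "Matt", "Evan"),
  lastName = c("Williams", "Barker", "Barker", "Williams", "Stevenson", "Parker"),
  age = c(22, 30, 41, 14, 9, 93),
  favoriteMovie = c("Alien", "The Shining", "The Shining", "Halloween", "Alien", "Alien")
)
d1 <- data.frame(persons$firstName, persons$lastName, persons$age, persons$favoriteMovie)

# Keep only the ages of rows whose favorite movie is "Alien", then average them
# (alien_mean is a hypothetical name chosen for this example)
alien_mean <- mean(d1$persons.age[d1$persons.favoriteMovie == "Alien"])
alien_mean
```

This handles one movie at a time; the grouped approaches in the answers are the better fit when you want every movie at once.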
You can use by() for this:
by(d1$persons.age, d1$persons.favoriteMovie, mean)
d1$persons.favoriteMovie: Alien
[1] 41.33333
-------------------------------------------------------------------------------------------------------------
d1$persons.favoriteMovie: Halloween
[1] 14
-------------------------------------------------------------------------------------------------------------
d1$persons.favoriteMovie: The Shining
[1] 35.5
The doBy package, with its summaryBy function, can help you.
library(doBy)
summaryBy(persons.age~persons.favoriteMovie, data=d1, FUN=c(mean))
#persons.favoriteMovie persons.age.mean
#1 Alien 41.33333
#2 Halloween 14.00000
#3 The Shining 35.50000
Or you could use dplyr.
library(dplyr)
grouped <- group_by(d1, persons.favoriteMovie)
summarise(grouped, mean=mean(persons.age))
# persons.favoriteMovie mean
# (fctr) (dbl)
#1 Alien 41.33333
#2 Halloween 14.00000
#3 The Shining 35.50000
We can use data.table. Note that setDT() converts d1 to a data.table by reference:
library(data.table)
setDT(d1)[,.(persons.age = mean(persons.age)) , persons.favoriteMovie]
# persons.favoriteMovie persons.age
#1: Alien 41.33333
#2: The Shining 35.50000
#3: Halloween 14.00000
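If you would rather not load a package, base R's aggregate() may also work here. It is not mentioned in the answers above, so treat this as a hedged alternative sketch; it assumes the same column names as the question's d1, and the name res is only illustrative.

```r
# Rebuild only the two columns needed, using the column names from the question
d1 <- data.frame(
  persons.age = c(22, 30, 41, 14, 9, 93),
  persons.favoriteMovie = c("Alien", "The Shining", "The Shining",
                            "Halloween", "Alien", "Alien")
)

# Formula interface: average persons.age within each persons.favoriteMovie group
# (res is a hypothetical name chosen for this example)
res <- aggregate(persons.age ~ persons.favoriteMovie, data = d1, FUN = mean)
res
```

Unlike tapply, which returns a named vector, aggregate() returns a data frame with one row per movie, which can be convenient if you want to merge or plot the result.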