I have a dataframe that contains some raw data. Let's take the built-in "iris" dataset as an example.
# load a data sample
data("iris")
# Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#1 5.1 3.5 1.4 0.2 setosa
#2 4.9 3.0 1.4 0.2 setosa
#3 4.7 3.2 1.3 0.2 setosa
# ...
I have another dataframe which contains summarized data on the species.
species <- data.frame(unique(iris$Species))
colnames(species) <- "s"
# Add a zoom level
species$zoom <- c(2,3,5)
# species zoom
# 1 setosa 2
# 2 versicolor 3
# 3 virginica 5
I would like to add a calculated column to this summarized dataframe (called species in this example).
I tried both
species$mean <- species$zoom * mean(iris$Sepal.Length)
# (AND)
species$mean <- species$zoom * mean(iris$Sepal.Length[iris$Species==species$s])
but the first one doesn't work because it computes the mean over all of the raw data instead of grouping by species. The second one doesn't appear to work either.
Could I do this without looping on rows?
Perhaps this data.table approach can help you out?
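data("iris")
library(data.table)
# setDT() converts iris to a data.table by reference;
# the first [] computes the mean Sepal.Length per species,
# the second [] multiplies each group mean by its zoom level
# (hardcoded here, in the order the species appear)
setDT(iris)[, list(mean = mean(Sepal.Length)), by = Species][, mean_mult := mean * c(2, 3, 5)][]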
# Species mean mean_mult
# 1: setosa 5.006 10.012
# 2: versicolor 5.936 17.808
# 3: virginica 6.588 32.940
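If you would rather stay in base R and reuse the species data frame from the question instead of hardcoding the zoom levels, something along these lines should also work without an explicit loop. This is only a sketch: group_means is a name I made up, and it assumes every value in species$s also occurs in iris$Species.

```r
data("iris")

# Rebuild the summary table from the question
species <- data.frame(s = unique(iris$Species))
species$zoom <- c(2, 3, 5)

# Mean Sepal.Length per species, returned as a vector named by species
# (group_means is a hypothetical helper name, not from the question)
group_means <- tapply(iris$Sepal.Length, iris$Species, mean)

# Look up each row's group mean by name, then scale it by that row's zoom
species$mean <- species$zoom * as.numeric(group_means[as.character(species$s)])

species
```

Because the means are looked up by species name rather than by position, the result does not depend on the row order of species. As far as I can tell, the second attempt in the question fails because iris$Species == species$s recycles the three species values along all 150 rows instead of filtering per summary row, so mean() still returns a single number.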