
R geom_tile ggplot2 what kind of stat is applied?

I used geom_tile() to plot three variables on the same graph, with:

tile_ruined_coop<-ggplot(data=df.1[sel1,])+
  geom_tile(aes(x=bonus, y=malus, fill=rf/300))+
  scale_fill_gradient(name="vr")+
  facet_grid(Seuil_out_coop_i ~ nb_coop_init)
tile_ruined_coop

and I am pleased with the result!

(image: ggplot2 geom_tile example)

But what kind of statistical treatment is applied to fill? Is it a mean?

To plot the mean of the fill values, you should aggregate your values before plotting. scale_fill_gradient(...) does not work at the data level, but at the visualization level. Let's start with a toy data frame to build a reproducible example to work with.

mydata = expand.grid(bonus = seq(0, 1, 0.25), malus = seq(0, 1, 0.25), type = c("Risquophile","Moyen","Risquophobe"))
mydata = do.call("rbind",replicate(40, mydata, simplify = FALSE))
mydata$value= runif(nrow(mydata), min=0, max=50)
mydata$coop = "cooperative"

Now, before plotting, I suggest you calculate the mean over your groups of 40 values. For this operation I like to use the dplyr package:

library(dplyr)
# group_by() takes bare column names; quoted strings would be treated as constants, not as columns
data = mydata %>% group_by(bonus, malus, type, coop) %>% summarise(vr = mean(value))
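
As a sanity check on the aggregation step, the same per-group mean can be computed with base R's aggregate(). This is only a sketch that rebuilds the toy data from this answer; the set.seed() call and the name agg are additions made here for illustration, not part of the original answer.

```r
# Rebuild the toy data from the answer (set.seed is added only for reproducibility).
set.seed(1)
mydata <- expand.grid(bonus = seq(0, 1, 0.25), malus = seq(0, 1, 0.25),
                      type = c("Risquophile", "Moyen", "Risquophobe"))
mydata <- do.call("rbind", replicate(40, mydata, simplify = FALSE))
mydata$value <- runif(nrow(mydata), min = 0, max = 50)
mydata$coop <- "cooperative"

# Mean of `value` per (bonus, malus, type, coop) cell, using base R only.
agg <- aggregate(value ~ bonus + malus + type + coop, data = mydata, FUN = mean)
names(agg)[names(agg) == "value"] <- "vr"

# One row per tile: 5 bonus levels x 5 malus levels x 3 types = 75 rows.
nrow(agg)

# Returns 0 when no (bonus, malus, type, coop) combination appears twice.
anyDuplicated(agg[, c("bonus", "malus", "type", "coop")])
```

If the aggregated table has exactly one row per tile, geom_tile() has nothing left to overplot, so the fill of each tile is the mean you computed.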

Now you have your dataset ready to plot with ggplot2:

library(ggplot2)
g = ggplot(data, aes(x=bonus,y=malus,fill=vr))
g = g + geom_tile()
g = g + facet_grid(type~coop)

and this is the result: (image: tile plot)

Here you can be sure that each fill value is exactly the mean of your values.
Is this what you expected?

It uses stat_identity as can be seen in the documentation. You can test that easily:

DF <- data.frame(x=c(rep(1:2, 2), 1), 
                 y=c(rep(1:2, each=2), 1), 
                 fill=1:5)

#  x y fill
#1 1 1    1
#2 2 1    2
#3 1 2    3
#4 2 2    4
#5 1 1    5

p <- ggplot(data=DF) +
  geom_tile(aes(x=x, y=y, fill=fill))

print(p)


As you can see, the fill value for the 1/1 combination is 5: both rows for that cell are drawn, and the later one ends up on top. If you use factors it's even clearer what happens:

p <- ggplot(data=DF) +
  geom_tile(aes(x=x, y=y, fill=factor(fill)))

print(p)

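
One way to spot cells that will be overplotted like the 1/1 combination is to count the rows per x/y pair before plotting. Here is a minimal base R sketch reusing DF from this answer; the names counts and n are just illustrative.

```r
DF <- data.frame(x = c(rep(1:2, 2), 1),
                 y = c(rep(1:2, each = 2), 1),
                 fill = 1:5)

# Count how many rows fall on each (x, y) cell.
counts <- aggregate(fill ~ x + y, data = DF, FUN = length)
names(counts)[names(counts) == "fill"] <- "n"

# Cells with n > 1 get several tiles drawn on top of each other.
counts[counts$n > 1, ]

# The rows competing for those cells.
key <- DF[, c("x", "y")]
DF[duplicated(key) | duplicated(key, fromLast = TRUE), ]
```

Any cell reported here needs to be aggregated (or otherwise reduced to one row) if the tile colour is meant to represent a summary rather than whichever row happens to be drawn last.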

If you want to depict means, I'd suggest calculating them outside of ggplot2:

library(plyr)
DF1 <- ddply(DF, .(x, y), summarize, fill=mean(fill))
p <- ggplot(data=DF1) +
  geom_tile(aes(x=x, y=y, fill=fill))

print(p)


That's easier than trying to find out if stat_summary can play with geom_tile somehow (I doubt it).

scale_fill_gradient() and geom_tile() apply no statistics (or rather, they apply stat_identity()) to your fill value rf/300. The scale just works out how many colors you use and then generates the colors with the munsell function 'mnsl()'. If you want to apply a transformation only to the colors displayed, you should use:

# geom_tile() is coloured through the fill aesthetic, so the fill scale is the one to transform
scale_fill_gradient(trans = "log")

or

scale_fill_gradient(trans = "sqrt")

Changing the color mapping between tiles might not be the best idea, since the plots have to be comparable and you compare the values by their colors. Hope this helps.
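
To see what a transformation on the scale does, it may help to look at where each value would land along the gradient with and without it. The base R sketch below is a simplified illustration of the idea, not ggplot2's actual internals; rescale01 is a hypothetical helper defined here.

```r
# Map values linearly onto [0, 1], i.e. a position along a two-colour gradient.
# (Hypothetical helper for illustration only.)
rescale01 <- function(v) (v - min(v)) / (max(v) - min(v))

values <- c(1, 10, 100)

linear_pos <- rescale01(values)        # no transformation
log_pos    <- rescale01(log(values))   # log transformation first
sqrt_pos   <- rescale01(sqrt(values))  # sqrt transformation first

# Position of each value along the gradient under each mapping.
data.frame(values, linear_pos, log_pos, sqrt_pos)
```

Without a transformation, 10 sits close to the low end of the gradient; with the log transformation it sits in the middle. The underlying values are untouched either way: only their position on the colour scale changes.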
