
Ratio of counts in R 2d plot

I have two continuous variables (X and Y) that I want to bin into a 2d grid. Associated with every (x, y) pair is a factor that is either PASS or FAIL. I want to plot the ratio of PASS to FAIL in each cell of the grid.

For example, using the iris dataset, ggplot(iris, aes(x=Sepal.Length, y=Petal.Length)) + geom_bin2d() plots the total count in each 2d bin. How do I change this to plot the ratio of the counts of virginica and versicolor in each bin?

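Use stat_summary2d(): preprocess the data so that the binary factor becomes a numeric column in the data frame, then map that column to the z aesthetic. stat_summary2d() applies a summary function to z within each bin. With the default, mean, a 0/1 column gives the fraction of ones (the PASS share) in each bin, which is not the same number as the PASS/FAIL quotient.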

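library(ggplot2)

# Random 0/1 flag standing in for the PASS/FAIL factor
iris$tf <- as.numeric(as.logical(round(runif(nrow(iris)))))

ggplot(iris, aes(x = Sepal.Length, y = Petal.Length, z = tf)) +
  # binwidth overrides bins when both are set, so only binwidth is kept
  stat_summary2d(fun = "mean", binwidth = 2) +
  labs(title = "Proportion of TRUE by Petal.Length and Sepal.Length") +
  scale_fill_continuous(name = "Proportion")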

Note: calling as.numeric() on a binary factor returns the level codes 1 and 2 (rather than 0 and 1), so subtract one. If the column is a logical, this is not necessary.
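The coercion described in the note can be checked on a small vector. This is a minimal sketch: the `status` factor and its level order are made up for illustration and are not part of the original data.

```r
# Hypothetical PASS/FAIL factor standing in for the real column
status <- factor(c("PASS", "FAIL", "PASS", "PASS"), levels = c("FAIL", "PASS"))

# as.numeric() on a factor returns the level codes (here 1 = FAIL, 2 = PASS)
codes <- as.numeric(status)

# Subtracting one turns the codes into a 0/1 indicator
indicator <- codes - 1

# Comparing against the level of interest gives the same indicator
# without relying on the order of the levels
indicator2 <- as.numeric(status == "PASS")
```

The comparison form is probably the safer of the two, since the subtract-one trick silently flips meaning if the levels are ordered the other way round.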

Edit: added the default fun='mean' argument to stat_summary2d() to make it clear that this is the function's default behaviour.
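Because fun can be any function of the z values in a bin, the same approach may also be used to get the quotient asked about in the question rather than the mean. The following is a sketch under some assumptions: `ratio_fun`, `two` and `is_virginica` are names made up for this example, setosa rows are dropped so that only the two species of interest remain, and stat_summary_2d() is used as the current spelling of stat_summary2d().

```r
library(ggplot2)

# Keep only the two species being compared
two <- droplevels(subset(iris, Species %in% c("virginica", "versicolor")))

# 1 for virginica, 0 for versicolor
two$is_virginica <- as.numeric(two$Species == "virginica")

# Hypothetical helper: number of ones divided by number of zeros in a bin.
# Returns Inf for a bin that contains no zeros (no versicolor).
ratio_fun <- function(z) sum(z) / sum(1 - z)

p <- ggplot(two, aes(x = Sepal.Length, y = Petal.Length, z = is_virginica)) +
  stat_summary_2d(fun = ratio_fun, bins = 10) +
  scale_fill_continuous(name = "virginica / versicolor")
print(p)
```

Bins that contain only virginica produce Inf, and how those bins are drawn depends on the fill scale, so it may be worth capping or filtering them before reading too much into the colours.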

