Using a ggplot histogram instead of the hist function in R

Question

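I am using the BetaMixture function in R (from the BetaModels package loaded below) to fit a mixture of beta distributions to a data vector. The output can be passed to hist, which produces a nice histogram with the mixture model components overlaid: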

# Install and load the libraries

#install.packages("BetaModels")
library(BetaModels)

# Create a vector, fit mixture models and plot the histogram

vec <- c(rbeta(700, 5, 2), rbeta(300, 1, 10))
model <- BetaMixture(vec,2)
h <- hist(model, breaks = 35)

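So far, so good. Now, how do I get this in ggplot? I inspected the h object, but it is no different from the model object; the two are identical. I don't even understand how hist works for this class. Other than @datavec, what does it pull from model to produce this histogram?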

Answer 1

You can retrieve the hist method for BetaMixture objects with getMethod("hist", "BetaMixture").
Below is a simple translation of that function into the "ggplot2 world".

myhist <- function (x, ...) {
    .local <- function (x, mixcols = 1:7, breaks=25, ...) 
    {
        df1 <- data.frame(x=x@datavec)
        p <- ggplot(data=df1, aes(x=x)) + 
             # after_stat(density) replaces the deprecated ..density.. notation (ggplot2 >= 3.3.0)
             geom_histogram(aes(y=after_stat(density)), bins=breaks, alpha=0.5, fill="gray50", color="black")
        while (length(mixcols) < ncol(x@mle)) mixcols <- c(mixcols, 
            mixcols)
        xv <- seq(0, 1, length = 502)[1:501]
        for (J in 1:ncol(x@mle)) {
            y <- x@phi[J] * dbeta(xv, x@mle[1, J], x@mle[2, J])
            df2 <- data.frame(xv, y)
            # linewidth replaces size for line geoms (ggplot2 >= 3.4.0)
            p <- p + geom_line(data=df2, aes(xv, y), linewidth=1, col=mixcols[J])
        }
        p <- p + theme_bw()
        invisible(p)
    }
    .local(x, ...)
}

library(ggplot2)
# Now p is a ggplot2 object.
p <- myhist(model, breaks=35)
print(p)
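
To illustrate why getMethod can find a hist method for an S4 class at all, here is a minimal, self-contained sketch. The class ToyMix is hypothetical and only mimics the slots used above (datavec, phi, mle); it is not the real BetaMixture definition. The sketch also assumes that the method returns its input invisibly, which would be consistent with the question's observation that h and model are identical.

```r
library(methods)

# Hypothetical toy class standing in for BetaMixture (illustration only)
setClass("ToyMix",
         representation(datavec = "numeric", phi = "numeric", mle = "matrix"))

# Registering an S4 method turns graphics::hist into an S4 generic
setMethod("hist", "ToyMix", function(x, ...) {
  graphics::hist(x@datavec, ...)  # draw (or compute) the plain histogram
  invisible(x)                    # assumption: hand back the object itself
})

obj <- new("ToyMix",
           datavec = c(0.1, 0.5, 0.9),
           phi = c(0.7, 0.3),
           mle = matrix(c(5, 2, 1, 10), nrow = 2))

slotNames(obj)                     # the slots a method can read with @
m <- getMethod("hist", "ToyMix")   # the method definition, printable as source
h <- hist(obj, plot = FALSE)       # plot = FALSE is passed on through ...
identical(h, obj)
```

When a method's arguments differ from the generic's, setMethod wraps the body in an inner .local function, which is likely where the .local wrapper in the translation above comes from.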

Answer 2

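The object returned by BetaMixture is an S4 class object with 2 slots of interest.

Slot Z holds a matrix with the probability that each data point belongs to each distribution.
In the first 6 rows shown here, all points belong to the second distribution.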

head(model@Z)
#             [,1]      [,2]
#[1,] 1.354527e-04 0.9998645
#[2,] 4.463074e-03 0.9955369
#[3,] 1.551999e-03 0.9984480
#[4,] 1.642579e-03 0.9983574
#[5,] 1.437047e-09 1.0000000
#[6,] 9.911427e-04 0.9990089
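
As a rough sketch of how such a membership matrix can arise, the block below computes the standard mixture posterior: each component's weighted density divided by the row total. The values of phi, mle and x are made up for illustration, and whether BetaMixture computes Z in exactly this way is an assumption.

```r
# Hypothetical parameters, chosen only for illustration (not BetaMixture output)
phi <- c(0.7, 0.3)                        # mixing proportions
mle <- matrix(c(5, 2, 1, 10), nrow = 2)   # column j holds shape1, shape2 of component j
x   <- c(0.05, 0.40, 0.85)                # a few data points in (0, 1)

# Weighted component densities: one row per point, one column per component
dens <- sapply(seq_along(phi),
               function(j) phi[j] * dbeta(x, mle[1, j], mle[2, j]))

# Normalize each row so the memberships of a point sum to 1
Z <- dens / rowSums(dens)

# Hard assignment: the most probable component for each point
apply(Z, 1, which.max)
```

With these made-up values, the point near 0 is assigned to the second component and the other two points to the first.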

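Slot mle holds the maximum likelihood estimates of the parameters.

Now use these values to create a data.frame from the vector and a data.frame of the parameters.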

df1 <- data.frame(vec)
df1$component <- factor(apply(model@Z, 1, which.max))
colors <- as.integer(levels(df1$component))

params <- as.data.frame(t(model@mle))
names(params) <- c("shape1", "shape2")

Plot the data.

library(ggplot2)

g <- ggplot(df1, aes(x = vec, group = component)) +
  # after_stat(density) replaces the deprecated ..density.. notation (ggplot2 >= 3.3.0)
  geom_histogram(aes(y = after_stat(density)),
                 bins = 35, fill = "grey", color = "grey40")

for(i in 1:nrow(params)){
  sh1 <- params$shape1[i]
  sh2 <- params$shape2[i]
  g <- g + stat_function(
    fun = dbeta,
    args = list(shape1 = sh1, shape2 = sh2),
    color = colors[i]
  )
}
suppressWarnings(print(g + theme_bw()))
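
The curves in this plot are unweighted beta densities drawn over a histogram grouped by component. A possible variant, sketched below, drops the grouping and scales each curve by its mixing proportion, as Answer 1 does with x@phi, so that the curves add up to the overall density. Here phi and mle are hypothetical stand-ins for model@phi and model@mle, so the block runs without a fitted model.

```r
library(ggplot2)

set.seed(1)
vec <- c(rbeta(700, 5, 2), rbeta(300, 1, 10))

# Hypothetical stand-ins for model@phi and model@mle (not fitted values)
phi <- c(0.7, 0.3)
mle <- matrix(c(5, 2, 1, 10), nrow = 2)

# One long data frame with a weighted density curve per component
xv <- seq(0.001, 0.999, length.out = 500)
curves <- do.call(rbind, lapply(seq_along(phi), function(j) {
  data.frame(x = xv,
             y = phi[j] * dbeta(xv, mle[1, j], mle[2, j]),
             component = factor(j))
}))

# Single ungrouped density histogram with both curves added in one layer
g2 <- ggplot(data.frame(vec = vec), aes(x = vec)) +
  geom_histogram(aes(y = after_stat(density)),
                 bins = 35, fill = "grey", color = "grey40") +
  geom_line(data = curves, aes(x = x, y = y, color = component)) +
  theme_bw()
print(g2)
```

Building the curves as data and mapping color to component also yields a legend, which the loop over stat_function does not.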

2 answers

Answer 1 (3, accepted, 2021-12-04 16:04:05)

Answer 2 (2, 2021-12-04 16:33:45)