How to convert a bar histogram into a line histogram in R

I've seen many examples of density plots, but a density plot's y-axis shows the probability density. What I am looking for is a line plot (like a density plot) whose y-axis contains counts (like a histogram).

I can do this in Excel, where I manually create the bins and frequencies, make a bar histogram, and then change the chart type to a line, but I can't find anything similar in R.

I've checked out both base R and ggplot2, yet can't seem to find an answer. I understand that histograms are meant to be bars, but I think representing them as a continuous line makes more visual sense.

Using default R graphics (i.e. without installing ggplot2) you can do the following, which might also make what the density function does a bit clearer:

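# Generate some data
data <- rnorm(1000)
# Get the kernel density estimate
dens <- density(data)
# Scale the density by the number of observations and plot it against x.
# The result is a count per unit of x; multiply by a bin width to compare with histogram bars.
plot(dens$x, length(data) * dens$y, type = "l", xlab = "Value", ylab = "Count estimate")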
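If actual bin counts are wanted rather than a smoothed estimate, one possible base R sketch is to let hist() do the binning without drawing anything and then plot the counts against the bin midpoints. The data, the seed, and the choice of roughly 30 breaks are illustrative assumptions, not part of the original answer.

```r
# Illustrative data; any numeric vector works
set.seed(1)
x <- rnorm(1000)

# Compute the bins and counts without drawing the bars
h <- hist(x, breaks = 30, plot = FALSE)

# Draw the counts as a line through the bin midpoints
plot(h$mids, h$counts, type = "l", xlab = "Value", ylab = "Count")
```

This mirrors the Excel workflow described in the question: the bins and frequencies are computed first, and only the chart type changes.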

This is an old question, but I thought it might be helpful to post a solution that specifically addresses it.

In ggplot2, you can plot a histogram and display the counts with bars using:

# `data` is a data frame; `value` is a placeholder for the column to bin
ggplot(data, aes(x = value)) +
  geom_histogram()

You can also display the same counts with lines using a frequency polygon:

# `data` is a data frame; `value` is a placeholder for the column to bin
ggplot(data, aes(x = value)) +
  geom_freqpoly()
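As a self-contained sketch of the two layers together, the example below overlays a frequency polygon on a histogram of the same data. The data frame df, its column value, and the bin width of 0.25 are assumptions made for illustration; using the same binwidth in both layers keeps the line and the bars on the same bins.

```r
library(ggplot2)

# Illustrative data; `df` and `value` are made-up names
set.seed(1)
df <- data.frame(value = rnorm(1000))

# Bars and line share the same bin width, so both show counts per bin
p <- ggplot(df, aes(x = value)) +
  geom_histogram(binwidth = 0.25, fill = "grey80") +
  geom_freqpoly(binwidth = 0.25, colour = "red")
print(p)
```

Dropping the geom_histogram() layer leaves only the line, which is the plot the question asks for.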

For more information, see the ggplot2 reference.

To adapt the example on the ?stat_density help page:

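# The movies data set now lives in the ggplot2movies package
library(ggplot2)
library(ggplot2movies)
m <- ggplot(movies, aes(x = rating))
# Standard density plot.
m + geom_density()
# Density plot with y-axis scaled to counts (density * number of observations).
# after_stat(count) replaces the deprecated ..count.. notation (ggplot2 >= 3.3.0).
m + geom_density(aes(y = after_stat(count)))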

Although this is an old question, I thought the following might be useful. Say you have a data set of 10,000 points that you believe follow a certain distribution, and you would like to plot the histogram of the actual data with the probability density curve of the fitted ideal distribution on top of it.

library(fitdistrplus)  # provides fitdist()

noise <- 2
#
# the noise is tagged onto the end using runif
# just to demo issues with real data and fitting
# the subtraction causes the data to have some
# negative values, which must be addressed in 
# the fit later on
#
noisylognorm <- rlnorm(10000, 
                        meanlog = 0.25, 
                        sdlog = 1) + 
                        (noise * runif(10000) - noise / 10)
#
# subset is used to remove the negative values
# as the lognormal distribution needs positive only
#
fitlnorm <- fitdist(subset(noisylognorm, 
                           noisylognorm > 0),
                           "lnorm")
# estimate[1] is meanlog and estimate[2] is sdlog of the fit
fitlnorm_density <- density(rlnorm(10000, 
                                   meanlog = fitlnorm$estimate[1],
                                   sdlog = fitlnorm$estimate[2]))
hist(subset(noisylognorm, 
            noisylognorm < 25),
     breaks = seq(-1, 25, 0.5),
     col = "lightblue",
     xlim = c(0, 25),
     xlab = "value",
     ylab = "frequency",
     main = paste0("Log Normal Distribution\n",
                   "noise = ", noise))

lines(fitlnorm_density$x, 
      10000 * fitlnorm_density$y * 0.5,
      type="l",
      col = "red")

Note the * 0.5 in the lines() call. It is the bin width set by breaks = seq(-1, 25, 0.5): hist() plots counts per bin, so the density has to be multiplied by both the number of observations and the bin width to land on the same scale as the bars.
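The scaling rule (expected count per bin ≈ n × density × bin width) can be checked with a smaller sketch. It swaps in dlnorm() evaluated at the bin midpoints instead of a kernel density of simulated draws, and it uses the known parameters rather than a fitted model; both are simplifying assumptions for illustration, not the approach of the answer above.

```r
set.seed(1)
n <- 10000
binwidth <- 0.5

# Lognormal sample without added noise, truncated to the plotted range
x <- rlnorm(n, meanlog = 0.25, sdlog = 1)
x_shown <- x[x < 25]

# Histogram of counts with a fixed bin width
h <- hist(x_shown, breaks = seq(0, 25, by = binwidth),
          col = "lightblue", xlab = "value", ylab = "frequency",
          main = "Counts with scaled density")

# Expected count per bin: n * density at the bin midpoint * bin width
expected <- n * dlnorm(h$mids, meanlog = 0.25, sdlog = 1) * binwidth
lines(h$mids, expected, col = "red")
```

Changing binwidth changes the bar heights and the red line by the same factor, which is why the factor cannot be left out.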

There is a very simple and fast way to do this for count data.

First, let's generate some dummy count data:

my.count.data = rpois(n = 10000, lambda = 3)

And then the plotting command (assuming you have called library(magrittr)):

my.count.data %>% table %>% plot
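For a connected line rather than the default plot of a table, a possible sketch without magrittr is to pull the distinct values and their counts out of the table and plot them directly. The names tab, values, and counts are illustrative.

```r
set.seed(1)
my.count.data <- rpois(n = 10000, lambda = 3)

# Tabulate the counts of each distinct value
tab <- table(my.count.data)
values <- as.integer(names(tab))
counts <- as.vector(tab)

# Connect the counts with a line
plot(values, counts, type = "l", xlab = "Value", ylab = "Count")
```

One caveat: table() only lists values that occur, so a value with zero observations is skipped and the line passes straight over it.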
