
Inverse transform Cauchy dist in R

I'm trying to use the inverse cumulative distribution method to plot a histogram from the standard Cauchy distribution, and I'm getting a strange plot that doesn't look like the textbook standard Cauchy. I think my inverse function is correct (x = tan(pi*(u - 1/2))), so I would appreciate some help. Here is the R code that I have used:

n <- 10000
u <- runif(n)
c.samp <- sapply(u, function(u) tan(pi*(u - 1/2)))
hist(c.samp, breaks = 90, col = "blue",
    main = "Hist of Cauchy")

The resulting plot just doesn't look correct:

[Image: histogram of c.samp]

Any help is appreciated, thank you.

The histogram and the sampling technique are correct.
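For reference, here is a short derivation of the inverse used in the question. It starts from the standard Cauchy CDF (a well-known result) and writes u for a Uniform(0, 1) draw; the symbol F is notation introduced here, not taken from the question.

```latex
\begin{align}
F(x) &= \frac{1}{2} + \frac{1}{\pi}\arctan(x), \qquad x \in \mathbb{R} \\
u = F(x) \;&\Longleftrightarrow\; \arctan(x) = \pi\left(u - \tfrac{1}{2}\right) \\
\;&\Longleftrightarrow\; x = F^{-1}(u) = \tan\!\left(\pi\left(u - \tfrac{1}{2}\right)\right), \qquad u \in (0, 1)
\end{align}
```

So applying tan(pi*(u - 1/2)) to uniform draws does give standard Cauchy draws.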

Compare the results with the following, which uses R's built-in Cauchy sampling function:

c.samp2 <- rcauchy(n)
hist(c.samp2, breaks = 90, col = "blue",
     main = "Hist of Cauchy 2")

The output here also looks incorrect, but it is not.
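A numerical check can be more convincing than eyeballing two odd-looking histograms. The sketch below compares a few sample quantiles of both samples with the theoretical ones from qcauchy. The seed and the probability levels in p are arbitrary choices made for this illustration, and the inverse transform is written in vectorized form rather than with sapply.

```r
set.seed(1)                          # arbitrary seed, only for reproducibility
n <- 10000
u <- runif(n)
c.samp  <- tan(pi * (u - 1/2))       # inverse-transform sample (vectorized)
c.samp2 <- rcauchy(n)                # built-in sampler

p <- c(0.05, 0.25, 0.5, 0.75, 0.95)  # illustrative probability levels
round(rbind(inverse = quantile(c.samp, p),
            rcauchy = quantile(c.samp2, p),
            theory  = qcauchy(p)), 2)
```

The three rows should agree up to sampling noise, with more noise in the outer quantiles than near the median.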

First, note that the x-axis range is chosen by default from the extreme values you happen to encounter in the sample. As you probably know, the Cauchy distribution is extremely fat-tailed, so very large but rare values are expected. In a sample of 10000 draws, those few extreme observations stretch the axis and squeeze the bulk of the data into a handful of bars, while the extremes themselves do not show up on the plot because only very few observations fall into each of the outlying bins.
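To put rough numbers on this, the following sketch looks at how far the extremes reach and how much of the sample sits near zero. The seed and the window of [-50, 50] are arbitrary choices for illustration.

```r
set.seed(1)                        # arbitrary seed, only for reproducibility
x <- rcauchy(10000)

range(x)                           # the extremes that set the default x-axis
mean(abs(x) <= 50)                 # observed share of the sample in [-50, 50]
pcauchy(50) - pcauchy(-50)         # theoretical share, about 0.987

# Bin counts hist() would draw with the question's settings, largest first
h <- hist(x, breaks = 90, plot = FALSE)
head(sort(h$counts, decreasing = TRUE))
```

Typically the largest one or two counts account for nearly the whole sample, which is why the plot looks like a single spike.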

The default way hist chooses the bins is also poorly suited to a distribution like the Cauchy. Try, e.g.:

hist(c.samp2, breaks = "FD", col = "blue",
     main = "Hist of Cauchy 2",
     xlim = c(-500, 500))

[Image: histogram of c.samp2 with the x-axis limited to -500..500]

I suggest reading the help("hist") page carefully and playing around with the parameters to get a good and useful histogram.
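One further option, sketched here under stated assumptions, is to supply explicit breaks over a fixed window. Because hist requires the breaks to cover all the data it is given, the sample is first restricted to the window; the half-width lim, the bin width of 0.5, and the rescaling step are choices made for this illustration, not something from the help page.

```r
set.seed(1)                                # arbitrary seed, only for reproducibility
x <- rcauchy(10000)

lim <- 20                                  # assumed half-width of the window
inside <- x[abs(x) <= lim]                 # breaks must span all values passed in
h <- hist(inside, breaks = seq(-lim, lim, by = 0.5), plot = FALSE)

# Rescale so bar areas are shares of the full sample, not of the subset
h$density <- h$counts / (length(x) * diff(h$breaks))

plot(h, freq = FALSE, col = "blue", main = "Cauchy sample, |x| <= 20")
curve(dcauchy, add = TRUE, col = "red")    # theoretical density for comparison
```

Without the rescaling step the bars would integrate to 1 over the window and sit slightly above the theoretical curve.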

By narrowing the x-axis range, putting the y-axis on a density scale (freq = FALSE), overlaying the theoretical density and adding a "rug", you get something more useful.

hist(c.samp, breaks = "FD", col = "blue",
     main = "Hist of Cauchy distribution",
     xlim = c(-50, 50),
     freq = FALSE)
curve(dcauchy, add = TRUE, col = "red")
rug(c.samp)

[Image: density-scale histogram of c.samp on -50..50 with the theoretical Cauchy density in red and a rug]

Note that switching between c.samp and c.samp2 now hardly changes the plot.
