How to generate a probability density function and expected value in R?

Question

The task:

Eric the fly has a friend, Ernie. Assume that the two flies sit at independent locations, uniformly distributed on the globe's surface. Let D denote the Euclidean distance between Eric and Ernie (i.e., along a straight line through the interior of the globe).

Make a conjecture about the probability density function of D and give an estimate of its expected value, E(D).

So far I have written a function that generates points on the globe's surface, but I am unsure what to do next:

sample3d <- function(n)
  {
  df <- data.frame()
  while(n > 0){
    x <- runif(1,-1,1)
    y <- runif(1,-1,1)
    z <- runif(1,-1,1)
    r <- x^2 + y^2 + z^2
    if (r < 1){
      u <- sqrt(x^2+y^2+z^2)
      vector = data.frame(x = x/u,y = y/u, z = z/u)
      df <- rbind(vector,df)
      n = n- 1
    }
  }
  df
}
E <- sample3d(2)

Answer 1

This is an interesting problem. I'll outline a computational approach and leave the math up to you.

First, we fix a random seed for reproducibility.
```
 set.seed(2018); 
```

We sample 10^4 points from the surface of the unit sphere.

    sample3d <- function(n = 100) {
      df <- data.frame();
      while(n > 0) {
        # Draw a candidate point uniformly from the cube [-1, 1]^3
        x <- runif(1,-1,1)
        y <- runif(1,-1,1)
        z <- runif(1,-1,1)
        r <- x^2 + y^2 + z^2
        # Keep it only if it falls inside the unit ball, then project onto the sphere
        if (r < 1) {
          u <- sqrt(x^2 + y^2 + z^2)
          vector = data.frame(x = x/u,y = y/u, z = z/u)
          df <- rbind(vector,df)
          n = n- 1
        }
      }
      df
    }
    df <- sample3d(10^4);

Note that `sample3d` is not very efficient, but that's a different issue.
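
One way to address the efficiency point is to drop the rejection loop altogether. The following is a minimal sketch, not the method used in this answer: it relies on the fact that a vector of three independent standard normals, rescaled to unit length, is uniformly distributed on the sphere. The name `sample_sphere` is hypothetical.

```r
# Hypothetical vectorised alternative to sample3d (an assumption, not the answer's code).
# Draw n Gaussian vectors in one go and scale each row to unit length.
sample_sphere <- function(n = 100) {
  m <- matrix(rnorm(3 * n), ncol = 3,
              dimnames = list(NULL, c("x", "y", "z")))
  # rowSums(m^2) has length n and is recycled down the columns,
  # so every row is divided by its own norm
  as.data.frame(m / sqrt(rowSums(m^2)))
}

set.seed(2018)
pts <- sample_sphere(10^4)
head(pts)
```

Because all points are generated at once, there is no repeated `rbind` on a growing data frame, which is the main cost in the loop version.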

We now randomly sample 2 points from `df`, calculate the Euclidean distance between them (using `dist`), and repeat this procedure `N = 10^4` times.
```
# Sample 2 points randomly from df, repeat N times
N <- 10^4;
dist <- replicate(N, dist(df[sample(1:nrow(df), 2), ]));
```
As pointed out by @JosephWood, the number `N = 10^4` is somewhat arbitrary. We are using a bootstrap to derive the empirical distribution. For `N -> infinity` one can show that the empirical bootstrap distribution is the same as the (unknown) population distribution (Bootstrap theorem). The error between the empirical and population distributions is of order `1/sqrt(N)`, so `N = 10^4` should lead to an error of around 1%.
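
The `1/sqrt(N)` scaling can be made concrete by attaching a standard error to the estimated mean. The sketch below is an illustration under assumptions, not the procedure above: instead of resampling pairs from a fixed pool, it draws `N` fresh independent pairs using normalised Gaussian vectors, and the helper name `unit_rows` is hypothetical.

```r
set.seed(2018)
N <- 10^4

# Hypothetical helper: scale each row of a matrix to unit length
unit_rows <- function(m) m / sqrt(rowSums(m^2))

# Two independent sets of N uniform points on the unit sphere
a <- unit_rows(matrix(rnorm(3 * N), ncol = 3))
b <- unit_rows(matrix(rnorm(3 * N), ncol = 3))

# Straight-line (chord) distance for each pair
d <- sqrt(rowSums((a - b)^2))

est <- mean(d)                    # Monte Carlo estimate of E(D)
se  <- sd(d) / sqrt(N)            # standard error, shrinks like 1/sqrt(N)
ci  <- est + c(-1, 1) * 1.96 * se # approximate 95% interval

est; se; ci
```

Quadrupling `N` should roughly halve `se`, which is a quick way to see the scaling in practice.
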
We can plot the resulting probability distribution as a histogram:
```
# Let's plot the distribution
library(ggplot2);
ggplot(data.frame(x = dist), aes(x)) +
    geom_histogram(bins = 50);
```

Finally, we can get empirical estimates for the mean and median.
```
# Mean
mean(dist);
#[1] 1.333021

# Median
median(dist);
#[1] 1.41602
```
These values are close to the theoretical values:
```
mean.th = 4/3
median.th = sqrt(2)
```
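
A conjecture consistent with both values is the linear density f(d) = d/2 on [0, 2], with CDF F(d) = d^2/4: it gives a mean of 4/3 and a median of sqrt(2). The sketch below is one way to check this conjecture numerically and is not part of the procedure above. It assumes fresh pairs drawn with normalised Gaussian vectors rather than `sample3d`, uses base graphics instead of `ggplot2`, and the helper name `unit_rows` is hypothetical.

```r
set.seed(2018)
n <- 10^4

# Hypothetical helper: scale each row of a matrix to unit length
unit_rows <- function(m) m / sqrt(rowSums(m^2))

# n independent pairs of uniform points on the unit sphere
p <- unit_rows(matrix(rnorm(3 * n), ncol = 3))
q <- unit_rows(matrix(rnorm(3 * n), ncol = 3))
d <- sqrt(rowSums((p - q)^2))

# Histogram on the density scale with the conjectured density d/2 on top
hist(d, breaks = 50, freq = FALSE, main = "Distance between two flies", xlab = "D")
curve(x / 2, from = 0, to = 2, add = TRUE, lwd = 2)

# Largest gap between the empirical CDF and the conjectured CDF d^2/4,
# evaluated at the sample points
gap <- max(abs(ecdf(d)(d) - d^2 / 4))
gap
```

If the conjecture holds, `gap` should be of order `1/sqrt(n)`, and the histogram bars should follow the straight line closely.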

Answer 1: 2, accepted, 2018-03-20 02:24:26
