
How to create a table of cumulative distribution in R?

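I have plotted the cumulative distribution of speed using ecdf, but I would also like to get the cumulative probabilities as output, as in the following table: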

Speed  Cumulative Probability
40  0.20
45  0.45
55  0.51
60  0.70
70  0.90
80  1.00

For my data, ecdf gives the following (note that "cc" is my original data frame):

> ccf <- subset(cc, cc$svel>=55 & cc$Headway>=4)  
> cdf<-  ecdf(ccf$svel)
> cdf
Empirical CDF 
Call: ecdf(ccf$svel)
 x[1:356] =     55,  55.01,  55.02,  ...,  76.76,   76.8

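How can I get a table like the example above? Note that I tried "cumsum", but it only gives the cumulative frequency, whereas I need the cumulative probability.

Edit

Here is my data: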

dput(ccf$svel)
c(67.9, 67.62, 67.37, 67.19, 67.04, 66.93, 66.83, 66.74, 66.65, 66.55, 66.46, 66.36, 66.25, 66.12, 65.97,
61.12, 61.2, 61.29, 61.39, 61.49, 61.58, 61.66, 61.73, 61.79, 57.98, 57.73, 57.5, 57.29, 57.1, 56.92,
56.75, 56.59, 56.45, 56.32, 56.19, 58, 58.18, 58.36, 58.52, 58.69, 56.28, 56.19, 56.08, 55.96, 55.83, 55.68,
55.52, 55.34, 55.15, 58.58, 58.89, 59.17, 59.4, 59.58, 55.01, 55.14, 55.23, 55.3, 55.36, 55.41, 55.47,
55.53, 55.59, 55.66, 55.74, 55.83, 55.92, 56.03, 56.16, 56.3, 56.44,
56.58, 56.71, 56.82, 56.91, 56.98, 57.03, 57.06, 57.07, 57.07, 57.06, 57.04, 57.02, 55.05, 55.22, 55.39,
55.56, 55.73, 55.92, 56.11, 56.31, 56.53, 56.77, 57.02, 57.02, 57.28,
57.79, 58, 58.18, 58.32, 58.43, 58.5, 58.56, 58.6, 58.64, 58.68, 58.73, 58.8, 58.86, 58.92, 58.97,
59.01, 59.03, 59.05, 59.05, 59.04, 59.02, 58.99, 58.97, 58.95, 55.1,
55.39, 55.68, 55.97, 56.24, 56.48, 56.68, 56.82, 56.9, 56.94, 56.96, 56.97, 56.99, 57.02, 57.07, 57.14,
57.22, 57.3, 57.37, 57.41, 57.45, 57.48, 57.51, 57.56, 57.62, 57.69, 57.77, 57.86, 57.95, 58.06, 58.17,
58.29, 58.42, 58.53, 58.64, 58.74, 58.83, 58.91, 58.98, 55.01, 55.08, 55.15, 55.22, 55.3, 55.37, 55.45,
55.53, 55.53, 55.53,
55.73, 55.85, 55.99, 56.14, 56.31, 56.49, 56.67, 56.87, 57.05, 57.22, 57.37, 57.51, 57.65, 57.79, 57.95,
58.13, 58.3, 58.47, 58.63, 58.78, 58.91, 59.03, 59.14, 59.24, 59.,
59.43, 59.53, 59.62, 59.72, 59.81, 59.9, 59.98, 60.07, 60.15, 60.22, 60.31, 60.39, 60.47, 60.56, 60.65,
60.75, 60.86, 60.98, 61.11, 61.24, 61.39, 61.54, 61.71, 61.89, 62.09,
62.31, 62.56, 62.84, 63.14, 63.46, 63.78, 64.08, 64.81, 64.84, 64.85, 64.87, 64.89, 64.92, 64.94, 64.97,
65, 65.02, 65.04, 65.07, 65.11, 65.15, 65.17, 65.18, 65.17, 65.15,
65.13, 65.1, 65.06, 65.01, 64.96, 64.9, 64.84, 64.79, 64.76, 55.04, 55.15, 55.25, 55, 55.23, 55.45,
55.68, 55.9, 56.69, 56.74, 55, 55, 55, 55, 55, 55, 55.01,
55.26, 55.51, 55.77, 56.02, 56.28, 56.56, 56.84, 57.13, 57.42, 57.7, 57.98, 58.25, 58.49, 58.73, 58.94,
59.13, 59.29, 59.4, 59.48, 59.5, 59.48, 59.42, 59.31,
59.17, 59, 58.8, 58.6, 58.38, 58.17, 57.96, 57.77, 57.59, 57.44, 57.31, 57.21, 57.13, 57.07, 57.04,
57.03, 57.04, 57.07, 57.11, 57.18, 57.26, 57.34, 57.43, 57.51,
57.59, 57.68, 57.78, 57.88, 57.99, 58.08, 58.16, 58.22, 58.27, 58.3, 58.31, 58.31, 58.3, 58.27, 58.25,
58.22, 58.18, 58.14, 58.08, 58.01, 57.93, 57.84, 57.72, 57.59, 57.,
57.27, 57.1, 56.93, 56.77, 56.63, 56.5, 56.38, 56.28, 56.19, 56.12, 56.05, 55.99, 55.94, 55.9, 55.88,
55.86, 55.85, 55.86, 55.87, 55.89, 55.9, 55.91, 55.91, 55.88, 55.84,
55.78, 55.71, 55.63, 55.56, 55.5, 55.45, 55.4, 55.37, 55.34, 55.32, 55.3, 55.29, 55.27, 55.26, 55.26,
55.25, 55.25, 55.26, 55.26, 55.27, 55.28, 55.29, 55.31, 55.33, 55.36,
55.39, 55.02, 55.07, 55.12, 55.16, 55.21, 55.26, 55.31, 55.04, 55.21, 55.38, 55.54, 55.71, 55.88, 56.05,
56.21, 56.38, 56.54, 56.71, 56.88, 57.04, 57.2, 57.35, 55.46, 55.26,
55.74, 55.92, 56.11, 56.32, 56.54, 56.77, 57.02, 57.28, 55.22, 55.28, 55.35, 55.42, 55.5, 55.58, 55.68,
55.78, 55.88, 56, 55.15, 55.45, 55.72,
55.94, 56.11, 56.22, 56.29, 56.33, 56.36, 56.4, 56.45, 56.51, 56.59, 56.69, 56.81, 56.95, 57.11, 57.27,
57.44, 57.61, 57.78, 57.95, 58.12, 58.29, 58.46, 58.63, 58.79, 58.,
59.08, 59.21, 59.32, 59.41, 55.13, 55.3, 55.47, 55.65, 55.83, 56.02, 56.22, 56.43, 56.66, 56.9, 55.17,
56.02, 56.11, 56.21, 56.32, 56.42, 56.52, 57.18, 57.29, 57.42, 76.27,
76.28, 76.3, 76.33, 76.37, 76.41, 76.47, 76.54, 76.62, 76.7, 76.76, 76.8, 76.8, 55.08, 55.16, 55.24,
55.32, 55.4, 55.48, 55.12, 55.39, 55.67, 55.94, 56.21, 56.47, 56.72,
56.97, 57.19, 57.4, 57.58, 57.73, 57.87, 57.99, 58.11)

Here is a function that does this:

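cumprob <- function(y) {
  # proportion of observations strictly below x (use y <= x to match ecdf())
  fun <- function(y, x) length(y[y < x]) / length(y)
  prob <- sapply(y, fun, y = y)
  # one row per distinct value, sorted ascending, paired with its probability
  data.frame(value = unique(y[order(y)]), prob = unique(prob[order(prob)]))
}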

Testing it on your data (which I call data here):

cp<-cumprob(data)
head(cp)
  value       prob
1 55.00 0.00000000
2 55.01 0.01156069
3 55.02 0.01734104
4 55.04 0.01926782
5 55.05 0.02312139
6 55.07 0.02504817

Plot:

plot(cp)
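
As a cross-check on the function above, the object returned by ecdf is itself a function, so it can be evaluated at its own jump points to build the table directly. This is only a sketch on a small made-up sample (the vector svel below is not the asker's data, and the column names are just illustrative):

```r
# Small made-up sample standing in for ccf$svel (assumption, not the real data)
svel <- c(55, 55, 56.5, 58, 58, 58, 60.2, 61, 64.8, 76.8)

cdf  <- ecdf(svel)   # ecdf() returns a step function
vals <- knots(cdf)   # sorted distinct observed values

# One row per distinct speed with P(X <= speed)
cum_table <- data.frame(Speed = vals, CumulativeProbability = cdf(vals))
print(cum_table)

# Same numbers from cumulative frequencies divided by the sample size
freq_based <- as.numeric(cumsum(table(svel))) / length(svel)
print(all.equal(freq_based, cum_table$CumulativeProbability))
```

The last two lines show how the cumsum output mentioned in the question turns into probabilities: divide the cumulative frequencies by the number of observations.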


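Another quick and informal approach that I find convenient is to use the hist function to cut the data automatically and take the bin midpoints.

Using your data as data: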

h <- hist(data)
cum.prob <- data.frame(value=h$mids, prob=cumsum(h$counts)/sum(h$counts))

That gives you:

> cum.prob
   value      prob
1     55 0.2793834
2     57 0.6319846
3     59 0.8285164
4     61 0.8786127
5     63 0.8921002
6     65 0.9479769
7     67 0.9749518
8     69 0.9749518
9     71 0.9749518
10    73 0.9749518
11    75 0.9749518
12    77 1.0000000
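
If a coarse table at round speeds is what is wanted (as in the example table in the question), one alternative to hist midpoints is to evaluate the ecdf function at breakpoints you choose yourself. A minimal sketch, again on a made-up sample; the 5-unit spacing and the names speeds and binned are assumptions for illustration:

```r
# Small made-up sample standing in for ccf$svel (assumption, not the real data)
svel <- c(55, 55, 56.5, 58, 58, 58, 60.2, 61, 64.8, 76.8)

cdf <- ecdf(svel)

# Speeds at which to report the cumulative probability (arbitrary choice)
speeds <- seq(55, 80, by = 5)

# cdf(s) is the share of observations less than or equal to s
binned <- data.frame(Speed = speeds, CumulativeProbability = cdf(speeds))
print(binned)
```

Each probability here refers to "at or below this speed", so the row labels are exact cut-offs rather than bin midpoints.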
