
Calculating confidence interval in Spark scala

I have the following DataFrame:

+---------------+-----------+-------------+-----+----+----+--------------------+-------------------+------+--------------------+-------+-------+--------------------+
|   time_stamp_0|sender_ip_1|receiver_ip_2|count|rank|  xi|                  pi|                  r|attack|             myvalue|max_int|min_int|                 int|
+---------------+-----------+-------------+-----+----+----+--------------------+-------------------+------+--------------------+-------+-------+--------------------+
|12:18:52.702936|   10.0.0.1|     10.0.0.4|11139|   1|  15| 0.00134661998384056|0.49609480204686235|     0|0.008901370242045487|11139.0|11139.0|[11139.000, 11139...|
|12:18:53.702976|   10.0.0.1|     10.0.0.4|11139|   2|  15| 0.00134661998384056|0.49609480204686235|     0|0.008901370242045487|11139.0|11139.0|[11139.000, 11139...|
|12:18:54.702873|   10.0.0.1|     10.0.0.4|11139|   3|  15| 0.00134661998384056|0.49609480204686235|     0|0.008901370242045487|11139.0|11139.0|[11139.000, 11139...|
|12:18:55.702825|   10.0.0.1|     10.0.0.4|11139|   4|  15| 0.00134661998384056|0.49609480204686235|     0|0.008901370242045487|11139.0|11139.0|[11139.000, 11139...|
|12:18:56.703021|   10.0.0.1|     10.0.0.4|11139|   5|  15| 0.00134661998384056|0.49609480204686235|     0|0.008901370242045487|11139.0|11139.0|[11139.000, 11139...|
|12:18:57.703786|   10.0.0.1|     10.0.0.4|11139|   6|  15| 0.00134661998384056|0.49609480204686235|     0|0.008901370242045487|11139.0|11139.0|[11139.000, 11139...|
|12:18:58.706354|   10.0.0.1|     10.0.0.4|11139|   7|  15| 0.00134661998384056|0.49609480204686235|     0|0.008901370242045487|11139.0|11139.0|[11139.000, 11139...|
|12:18:59.705885|   10.0.0.1|     10.0.0.4|11139|   8|  15| 0.00134661998384056|0.49609480204686235|     0|0.008901370242045487|11139.0|11139.0|[11139.000, 11139...|
|12:20:14.703371|   10.0.0.1|     10.0.0.4|11139|   9|  15| 0.00134661998384056|0.49609480204686235|     0|0.008901370242045487|11139.0|11139.0|[11139.000, 11139...|
|12:20:15.702891|   10.0.0.1|     10.0.0.4|11139|  10|  15| 0.00134661998384056|0.49609480204686235|     0|0.008901370242045487|11139.0|11139.0|[11139.000, 11139...|
|12:20:16.703450|   10.0.0.1|     10.0.0.4|11139|  11|  15| 0.00134661998384056|0.49609480204686235|     0|0.008901370242045487|11139.0|11139.0|[11139.000, 11139...|
|12:20:17.703087|   10.0.0.1|     10.0.0.4|11139|  12|  15| 0.00134661998384056|0.49609480204686235|     0|0.008901370242045487|11139.0|11139.0|[11139.000, 11139...|
|12:20:18.704467|   10.0.0.1|     10.0.0.4|11139|  13|  15| 0.00134661998384056|0.49609480204686235|     0|0.008901370242045487|11139.0|11139.0|[11139.000, 11139...|
|12:20:19.703472|   10.0.0.1|     10.0.0.4|11139|  14|  15| 0.00134661998384056|0.49609480204686235|     0|0.008901370242045487|11139.0|11139.0|[11139.000, 11139...|
|12:20:20.703268|   10.0.0.1|     10.0.0.4|11139|  15|  15| 0.00134661998384056|0.49609480204686235|     0|0.008901370242045487|11139.0|11139.0|[11139.000, 11139...|
|12:18:52.995718|   10.0.0.5|     10.0.0.1|11139|   1|  14|0.001256845318251...|0.49609480204686235|     0|0.008394658926763537|11139.0|11139.0|[11139.000, 11139...|
|12:18:53.995478|   10.0.0.5|     10.0.0.1|11139|   2|  14|0.001256845318251...|0.49609480204686235|     0|0.008394658926763537|11139.0|11139.0|[11139.000, 11139...|
|12:18:54.995653|   10.0.0.5|     10.0.0.1|11139|   3|  14|0.001256845318251...|0.49609480204686235|     0|0.008394658926763537|11139.0|11139.0|[11139.000, 11139...|
|12:18:55.995978|   10.0.0.5|     10.0.0.1|11139|   4|  14|0.001256845318251...|0.49609480204686235|     0|0.008394658926763537|11139.0|11139.0|[11139.000, 11139...|
|12:18:56.994984|   10.0.0.5|     10.0.0.1|11139|   5|  14|0.001256845318251...|0.49609480204686235|     0|0.008394658926763537|11139.0|11139.0|[11139.000, 11139...|
|12:18:57.995190|   10.0.0.5|     10.0.0.1|11139|   6|  14|0.001256845318251...|0.49609480204686235|     0|0.008394658926763537|11139.0|11139.0|[11139.000, 11139...|
|12:18:58.994970|   10.0.0.5|     10.0.0.1|11139|   7|  14|0.001256845318251...|0.49609480204686235|     0|0.008394658926763537|11139.0|11139.0|[11139.000, 11139...|
|12:20:14.995142|   10.0.0.5|     10.0.0.1|11139|   8|  14|0.001256845318251...|0.49609480204686235|     0|0.008394658926763537|11139.0|11139.0|[11139.000, 11139...|
|12:20:15.995244|   10.0.0.5|     10.0.0.1|11139|   9|  14|0.001256845318251...|0.49609480204686235|     0|0.008394658926763537|11139.0|11139.0|[11139.000, 11139...|
|12:20:16.995481|   10.0.0.5|     10.0.0.1|11139|  10|  14|0.001256845318251...|0.49609480204686235|     0|0.008394658926763537|11139.0|11139.0|[11139.000, 11139...|
|12:20:17.995213|   10.0.0.5|     10.0.0.1|11139|  11|  14|0.001256845318251...|0.49609480204686235|     0|0.008394658926763537|11139.0|11139.0|[11139.000, 11139...|
|12:20:18.994985|   10.0.0.5|     10.0.0.1|11139|  12|  14|0.001256845318251...|0.49609480204686235|     0|0.008394658926763537|11139.0|11139.0|[11139.000, 11139...|
|12:20:19.994872|   10.0.0.5|     10.0.0.1|11139|  13|  14|0.001256845318251...|0.49609480204686235|     0|0.008394658926763537|11139.0|11139.0|[11139.000, 11139...|
|12:20:20.994932|   10.0.0.5|     10.0.0.1|11139|  14|  14|0.001256845318251...|0.49609480204686235|     0|0.008394658926763537|11139.0|11139.0|[11139.000, 11139...|
|12:18:52.995744|   10.0.0.1|     10.0.0.5|11139|   1|  14|0.001256845318251...|0.49609480204686235|     0|0.008394658926763537|11139.0|11139.0|[11139.000, 11139...|
|12:18:53.995496|   10.0.0.1|     10.0.0.5|11139|   2|  14|0.001256845318251...|0.49609480204686235|     0|0.008394658926763537|11139.0|11139.0|[11139.000, 11139...|
|12:18:54.995665|   10.0.0.1|     10.0.0.5|11139|   3|  14|0.001256845318251...|0.49609480204686235|     0|0.008394658926763537|11139.0|11139.0|[11139.000, 11139...|
|12:18:55.995986|   10.0.0.1|     10.0.0.5|11139|   4|  14|0.001256845318251...|0.49609480204686235|     0|0.008394658926763537|11139.0|11139.0|[11139.000, 11139...|
|12:18:56.994999|   10.0.0.1|     10.0.0.5|11139|   5|  14|0.001256845318251...|0.49609480204686235|     0|0.008394658926763537|11139.0|11139.0|[11139.000, 11139...|
|12:18:57.995204|   10.0.0.1|     10.0.0.5|11139|   6|  14|0.001256845318251...|0.49609480204686235|     0|0.008394658926763537|11139.0|11139.0|[11139.000, 11139...|
|12:18:58.995057|   10.0.0.1|     10.0.0.5|11139|   7|  14|0.001256845318251...|0.49609480204686235|     0|0.008394658926763537|11139.0|11139.0|[11139.000, 11139...|
|12:20:14.995169|   10.0.0.1|     10.0.0.5|11139|   8|  14|0.001256845318251...|0.49609480204686235|     0|0.008394658926763537|11139.0|11139.0|[11139.000, 11139...|
|12:20:15.995261|   10.0.0.1|     10.0.0.5|11139|   9|  14|0.001256845318251...|0.49609480204686235|     0|0.008394658926763537|11139.0|11139.0|[11139.000, 11139...|
|12:20:16.995499|   10.0.0.1|     10.0.0.5|11139|  10|  14|0.001256845318251...|0.49609480204686235|     0|0.008394658926763537|11139.0|11139.0|[11139.000, 11139...|
|12:20:17.995220|   10.0.0.1|     10.0.0.5|11139|  11|  14|0.001256845318251...|0.49609480204686235|     0|0.008394658926763537|11139.0|11139.0|[11139.000, 11139...|
|12:20:18.994997|   10.0.0.1|     10.0.0.5|11139|  12|  14|0.001256845318251...|0.49609480204686235|     0|0.008394658926763537|11139.0|11139.0|[11139.000, 11139...|
|12:20:19.994891|   10.0.0.1|     10.0.0.5|11139|  13|  14|0.001256845318251...|0.49609480204686235|     0|0.008394658926763537|11139.0|11139.0|[11139.000, 11139...|
|12:20:20.994951|   10.0.0.1|     10.0.0.5|11139|  14|  14|0.001256845318251...|0.49609480204686235|     0|0.008394658926763537|11139.0|11139.0|[11139.000, 11139...|
|12:18:52.811535|   10.0.0.1|     10.0.0.2|11139|   1|5526| 0.49609480204686235|0.49609480204686235|     0|   0.347756620851195|11139.0|11139.0|[11139.000, 11139...|
|12:18:53.812029|   10.0.0.1|     10.0.0.2|11139|   2|5526| 0.49609480204686235|0.49609480204686235|     0|   0.347756620851195|11139.0|11139.0|[11139.000, 11139...|
|12:18:54.480070|   10.0.0.1|     10.0.0.2|11139|   3|5526| 0.49609480204686235|0.49609480204686235|     0|   0.347756620851195|11139.0|11139.0|[11139.000, 11139...|
|12:18:54.481196|   10.0.0.1|     10.0.0.2|11139|   4|5526| 0.49609480204686235|0.49609480204686235|     0|   0.347756620851195|11139.0|11139.0|[11139.000, 11139...|
|12:18:54.483532|   10.0.0.1|     10.0.0.2|11139|   5|5526| 0.49609480204686235|0.49609480204686235|     0|   0.347756620851195|11139.0|11139.0|[11139.000, 11139...|
|12:18:54.485713|   10.0.0.1|     10.0.0.2|11139|   6|5526| 0.49609480204686235|0.49609480204686235|     0|   0.347756620851195|11139.0|11139.0|[11139.000, 11139...|
|12:18:54.487091|   10.0.0.1|     10.0.0.2|11139|   7|5526| 0.49609480204686235|0.49609480204686235|     0|   0.347756620851195|11139.0|11139.0|[11139.000, 11139...|
|12:18:54.488272|   10.0.0.1|     10.0.0.2|11139|   8|5526| 0.49609480204686235|0.49609480204686235|     0|   0.347756620851195|11139.0|11139.0|[11139.000, 11139...|

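I need to compute a confidence interval for the "myvalue" column, together with its lower and upper bounds (for the calculation itself, see http://www.statisticshowto.com/how-to-find-a-confidence-interval/). I am using the following code: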

    import org.apache.spark.sql.functions.lit

    val cntInterval = final_add_count_rank_xi_pi_r_attack_antropy
      .select("myvalue").rdd
      .countApprox(timeout = 1000L, confidence = 0.95)
    val (lowCnt, highCnt) = (cntInterval.getFinalValue().low, cntInterval.getFinalValue().high)

    // Add the confidence interval to the DataFrame
    val final_integration_df = final_add_count_rank_xi_pi_r_attack_antropy
      .withColumn("max_int", lit(highCnt))
      .withColumn("min_int", lit(lowCnt))
      .withColumn("int", lit(cntInterval.getFinalValue().toString()))

    // Data becomes clean
    final_integration_df.show(100)

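My problem is that all three values in the DataFrame (the interval, its minimum, and its maximum) come out as 11139.0, which is the same as the number of connections between "10.0.0.1" and "10.0.0.2" (the count column in the DataFrame). Can you help me solve this? Thanks.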

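As I understand it, you want to compute the confidence for each row of the DataFrame. To do that, use a UDF instead of lit. The lit function inserts the same constant value into every row.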

Here is an example of a UDF:

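import org.apache.spark.sql.functions.udf
import spark.implicits._  // needed for toDF

val df = spark.sparkContext.parallelize(Seq(1, 2, 3, 4)).toDF("first")
// The UDF is evaluated per row, so each row gets its own value
val func = udf((i1: Int) => i1 + 3)
df.withColumn("sum", func(df("first"))).show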
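
A side note on the question's code: RDD.countApprox estimates the number of elements in the RDD, not a statistic of the values in it, which would explain why every bound equals a row count. If what is wanted is a confidence interval for the mean of "myvalue" per connection, one possible sketch is below. It assumes the normal-approximation formula mean ± z · s / sqrt(n) with z = 1.96 for 95%, and it groups by sender_ip_1 and receiver_ip_2; both choices are assumptions, not something stated in the question. The object name, the helper withMeanInterval, and the toy data are hypothetical.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{avg, col, count, lit, sqrt, stddev_samp}

object MeanIntervalSketch {
  // Assumption: z-score for a two-sided 95% interval under a normal approximation.
  val Z95 = 1.96

  // Hypothetical helper: adds min_int / max_int columns holding the bounds of
  // the interval for the mean of `valueCol` within each group of `groupCols`.
  def withMeanInterval(df: DataFrame, valueCol: String, groupCols: String*): DataFrame = {
    val w = Window.partitionBy(groupCols.map(col): _*)
    val mean = avg(col(valueCol)).over(w)          // group mean
    val sd = stddev_samp(col(valueCol)).over(w)    // sample standard deviation
    val n = count(col(valueCol)).over(w)           // non-null values in the group
    val margin = lit(Z95) * sd / sqrt(n)
    df.withColumn("min_int", mean - margin)
      .withColumn("max_int", mean + margin)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("mean-interval-sketch")
      .getOrCreate()
    import spark.implicits._

    // Toy data for illustration only, not the DataFrame from the question.
    val df = Seq(
      ("10.0.0.1", "10.0.0.2", 1.0),
      ("10.0.0.1", "10.0.0.2", 2.0),
      ("10.0.0.1", "10.0.0.2", 3.0),
      ("10.0.0.1", "10.0.0.4", 5.0),
      ("10.0.0.1", "10.0.0.4", 7.0)
    ).toDF("sender_ip_1", "receiver_ip_2", "myvalue")

    withMeanInterval(df, "myvalue", "sender_ip_1", "receiver_ip_2").show(false)
    spark.stop()
  }
}
```

Because the bounds are window expressions rather than lit constants, each group of rows gets its own interval. In the DataFrame shown in the question, "myvalue" is identical within each sender/receiver pair, so the sample standard deviation there is 0 and the interval collapses to a single point. A group with only one row has no defined sample standard deviation, so its bounds come out as null or NaN depending on the Spark version.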

