Median and Boxplot (R)

I am writing to your forum because I cannot find a solution to my problem. I am trying to represent graphically the Median Catching Time (MCT) of the mosquitoes that we (my team and I) have collected (I am currently doing an internship studying malaria in Ivory Coast). The MCT represents the time by which 50% of the total malaria vectors were caught on humans. For example, we collected this sample:

Hour of collection / Mosquitoes number:
20H-21H = 1
21H-22H = 1 
22H-23H = 2 
23H-00H = 2 
00H-01H = 13 
01H-02H = 10 
02H-03H = 15 
03H-04H = 15 
04H-05H = 8 
05H-06H = 10 
06H-07H = 6 

Here the cumulative count is 83 mosquitoes. I am assuming that the median rank of this mosquito series is (83+1)/2 = 42 (and I cannot even obtain this number in R), which gives a Median Catching Time of 2 am (02).
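The arithmetic above can be checked directly in R. This is only a minimal sketch, assuming the counts are typed in by hand in collection order; the names `heure`, `eff`, `rank_med`, `cum_eff` and `mct` are illustrative and not taken from the original code:

```r
# Counts per collection interval, in chronological order (from the sample above)
heure <- c("20H-21H", "21H-22H", "22H-23H", "23H-00H", "00H-01H", "01H-02H",
           "02H-03H", "03H-04H", "04H-05H", "05H-06H", "06H-07H")
eff <- c(1, 1, 2, 2, 13, 10, 15, 15, 8, 10, 6)

total    <- sum(eff)          # total number of mosquitoes caught
rank_med <- (total + 1) / 2   # rank of the median mosquito
cum_eff  <- cumsum(eff)       # cumulative catch at the end of each interval

# First interval whose cumulative count reaches the median rank
mct <- heure[which(cum_eff >= rank_med)[1]]

print(total)
print(rank_med)
print(mct)
```

Note that `median(eff)` would return the median of the eleven hourly counts, which is a different quantity from the rank of the median mosquito computed here.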

Therefore, I have tried to use the function "boxplot" with different parameters, but I cannot get the representation I want. Indeed, I get one box for each hour of collection, whereas I want a representation of the cumulative count over the time of collection. The time coding used in R is "20H-21H" = 20, "21H-22H" = 21, etc.
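One thing to watch with this numeric coding: if the intervals after midnight are coded 0, 1, 2, and so on (an assumption, since only the first two codes are given), the values no longer sort in collection order. A possible workaround, sketched here with a hypothetical `heure_num` vector, is to count the hours elapsed since the start of collection at 20H:

```r
# Assumed numeric coding: start hour of each collection interval
heure_num <- c(20, 21, 22, 23, 0, 1, 2, 3, 4, 5, 6)

# Hours elapsed since the first collection hour (20H);
# the modulo wraps the hours after midnight onto 4, 5, ...
elapsed <- (heure_num - 20) %% 24

print(elapsed)
```

The `elapsed` values increase monotonically through the night, so they can serve as a numeric time axis.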

I have found an article (Nicolas Moiroux, 2012) that presents the Median Catching Time with a boxplot like the one I would like to have. Here is the image of the cited boxplot: Boxplot_Moiroux2012

Thank you in advance for your help, and I hope that my grammar is fine (I speak and write mainly in French, my mother tongue).

Kind Regards, Edouard

PS: Regarding the code I have used with this set of data, here it is (with "Eff" = number of mosquitoes and "Heure" = time of collection):

sum(Eff)

as.factor(Heure)

tapply(Eff, Heure, median)
tapply(Heure, Eff, median)

boxplot(Eff, horizontal = T)

boxplot(Heure ~ Eff)
boxplot(Eff ~ Heure)

(My skills in R are not very sharp...)

You need to use a trick since you already have counts and not the time data for each catch.

First, you convert your time values to a more continuous variable, then you generate a vector with all the time values, and then you boxplot it (with a custom axis).

txt <- "20H-21H = 1
21H-22H = 1
22H-23H = 2
23H-00H = 2
00H-01H = 13
01H-02H = 10
02H-03H = 15
03H-04H = 15
04H-05H = 8
05H-06H = 10
06H-07H = 6"

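# strip.white removes the spaces around "=" so the axis labels stay clean
dat <- read.table(text = txt, sep = "=", header = FALSE, strip.white = TRUE)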
colnames(dat) <- c("collect_time", "nb_mosquito")

# make a continuous numerical proxy for time
dat$collect_time_num <- 1:nrow(dat)

# get values of proxy according to your data
tvals <- rep(dat$collect_time_num, dat$nb_mosquito)

# plot
boxplot(tvals, horizontal = TRUE, xaxt = "n")
axis(1, labels = as.character(dat$collect_time), at = dat$collect_time_num)

This outputs the following plot:

[image: horizontal boxplot of the catch times, with the x-axis labelled by collection interval]
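The same expanded vector also gives the median catching time as a value, not only as a line in the box. A small sketch, rebuilding the data by hand so that it runs on its own; `collect_time` and `nb_mosquito` mirror the columns above, while `mct_index` and `mct_label` are illustrative names:

```r
# Same data as above, entered directly so the snippet is self-contained
collect_time <- c("20H-21H", "21H-22H", "22H-23H", "23H-00H", "00H-01H",
                  "01H-02H", "02H-03H", "03H-04H", "04H-05H", "05H-06H",
                  "06H-07H")
nb_mosquito <- c(1, 1, 2, 2, 13, 10, 15, 15, 8, 10, 6)

# Numeric proxy for time, then one value per mosquito caught
collect_time_num <- seq_along(collect_time)
tvals <- rep(collect_time_num, nb_mosquito)

# Median of the per-mosquito times, mapped back to its interval label
mct_index <- median(tvals)
mct_label <- collect_time[mct_index]

print(mct_label)
```

With an even total, `median()` can fall halfway between two intervals, in which case the index would need rounding before it is used as a label lookup.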
