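How to vectorize length-frequency calculation?
I currently have a long piece of code with a for loop that calculates, for each maturity stage in my dataset, the frequency of every length. I would like to vectorize it or find a more elegant solution, but so far I haven't been able to work out how. The frequency calculation is fairly simple: (count of occurrences of a specific length at a certain maturity / total number of females or males) * 100
Sample data: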
Species Sex Maturity Length
1 HAK M 1 7
2 HAK M 2 24
3 HAK F 2 10
4 HAK M 3 25
5 HAK F 5 25
6 HAK F 4 12
The code I am currently using:
# Length, Sex and Maturity are the columns of the data set;
# total.m and total.f are the total numbers of males and females.
reps <- seq(min(Length), max(Length), by = 1)
# Makes vectors for each maturity stage for both sexes,
# same length as the reps vector, filled with NA for the loop:
m1 <- m2 <- m3 <- m4 <- m5 <- rep(NA, length(reps))
f1 <- f2 <- f3 <- f4 <- f5 <- rep(NA, length(reps))
# Loop:
for (i in seq_along(reps)) # repeats for each value of the x axis
{
  m1[i] <- length(Length[Length == reps[i] & Sex == "M" & Maturity == 1]) / total.m * 100
  m2[i] <- length(Length[Length == reps[i] & Sex == "M" & Maturity == 2]) / total.m * 100
  m3[i] <- length(Length[Length == reps[i] & Sex == "M" & Maturity == 3]) / total.m * 100
  m4[i] <- length(Length[Length == reps[i] & Sex == "M" & Maturity == 4]) / total.m * 100
  m5[i] <- length(Length[Length == reps[i] & Sex == "M" & Maturity == 5]) / total.m * 100
  f1[i] <- length(Length[Length == reps[i] & Sex == "F" & Maturity == 1]) / total.f * 100
  f2[i] <- length(Length[Length == reps[i] & Sex == "F" & Maturity == 2]) / total.f * 100
  f3[i] <- length(Length[Length == reps[i] & Sex == "F" & Maturity == 3]) / total.f * 100
  f4[i] <- length(Length[Length == reps[i] & Sex == "F" & Maturity == 4]) / total.f * 100
  f5[i] <- length(Length[Length == reps[i] & Sex == "F" & Maturity == 5]) / total.f * 100
}
# Stitching together the output of the loop.
males_all <- rbind(m1, m2, m3, m4, m5)
females_all <- rbind(f1, f2, f3, f4, f5)
This is the output I typically get from the loop:
mat X8 X9 X10 X11 X12 X14 X15
1 m1 0.104712 0.104712 0.6282723 1.3612565 1.884817 0.1047120 0.2094241
2 m2 0.000000 0.000000 0.3141361 0.8376963 2.198953 2.4083770 1.3612565
3 m3 0.000000 0.000000 0.0000000 0.0000000 0.104712 0.2094241 0.1047120
4 m4 0.000000 0.000000 0.0000000 0.0000000 0.000000 0.0000000 0.0000000
5 m5 0.000000 0.000000 0.0000000 0.0000000 0.000000 0.0000000 0.2094241
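The columns after mat are the lengths; for brevity I have not included all of them, and they can go up to about 30. females_all looks the same, just with f1, f2, etc. in the mat column.
As far as I can tell, this is what you want: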
library(dplyr)
counts = count(df, Sex, Maturity, Length)
totals = count(df, Sex, name = "total")
counts = counts %>% left_join(totals) %>%
  mutate(prop = n / total)
# # Joining, by = "Sex"
# # A tibble: 6 x 6
# Sex Maturity Length n total prop
# <fct> <int> <int> <int> <int> <dbl>
# 1 F 2 10 1 3 0.333
# 2 F 4 12 1 3 0.333
# 3 F 5 25 1 3 0.333
# 4 M 1 7 1 3 0.333
# 5 M 2 24 1 3 0.333
# 6 M 3 25 1 3 0.333
counts %>% select(Sex, Maturity, Length, prop) %>%
  tidyr::spread(key = Length, value = prop, fill = 0)
# # A tibble: 6 x 7
# Sex Maturity `7` `10` `12` `24` `25`
# <fct> <int> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 F 2 0 0.333 0 0 0
# 2 F 4 0 0 0.333 0 0
# 3 F 5 0 0 0 0 0.333
# 4 M 1 0.333 0 0 0 0
# 5 M 2 0 0 0 0.333 0
# 6 M 3 0 0 0 0 0.333
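The pipeline above returns proportions, while the question asks for percentages (count / total per sex * 100). A minimal sketch of the same idea in a single chain, assuming tidyr 1.1.0 or later: `pivot_wider()` is the function that superseded `spread()`, and the names `freq_wide` and `pct` are illustrative, not from the original answer.

```r
library(dplyr)
library(tidyr)

# Same sample data as in the question (row numbers omitted).
df <- read.table(text = "Species Sex Maturity Length
HAK M 1 7
HAK M 2 24
HAK F 2 10
HAK M 3 25
HAK F 5 25
HAK F 4 12", header = TRUE)

freq_wide <- df %>%
  count(Sex, Maturity, Length) %>%      # occurrences of each length per sex and maturity
  group_by(Sex) %>%
  mutate(pct = 100 * n / sum(n)) %>%    # sum(n) within a sex is the total for that sex
  ungroup() %>%
  select(-n) %>%
  pivot_wider(names_from = Length, values_from = pct,
              values_fill = 0,          # combinations that never occur become 0
              names_sort = TRUE)        # order the length columns by value

freq_wide
```

As with `spread()`, only lengths and maturity stages that occur in the data appear in the result; lengths that were never observed do not get a column of zeros.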
The code above uses this data:
df = read.table(text = " Species Sex Maturity Length
1 HAK M 1 7
2 HAK M 2 24
3 HAK F 2 10
4 HAK M 3 25
5 HAK F 5 25
6 HAK F 4 12", header = T)
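If the goal is to reproduce the layout of males_all and females_all from the question (one row per maturity stage, one column for every length from the minimum to the maximum, zeros included), a base R sketch with `table()` and `prop.table()` is one possible route. It assumes maturity stages run from 1 to 5, as in the question's loop; the names `len_levels`, `counts` and `pct` are illustrative.

```r
# Same sample data as in the question (row numbers omitted).
df <- read.table(text = "Species Sex Maturity Length
HAK M 1 7
HAK M 2 24
HAK F 2 10
HAK M 3 25
HAK F 5 25
HAK F 4 12", header = TRUE)

# Every integer length between the minimum and the maximum, like reps in the loop.
len_levels <- seq(min(df$Length), max(df$Length), by = 1)

# Three-way contingency table: sex x maturity x length.
# Explicit factor levels keep the unobserved maturities and lengths as zero counts.
counts <- table(Sex      = df$Sex,
                Maturity = factor(df$Maturity, levels = 1:5),
                Length   = factor(df$Length, levels = len_levels))

# margin = 1 divides each cell by the total count of its sex; * 100 gives percentages.
pct <- 100 * prop.table(counts, margin = 1)

# One maturity-by-length matrix per sex, labelled like the loop output.
males_all   <- pct["M", , ]
females_all <- pct["F", , ]
rownames(males_all)   <- paste0("m", 1:5)
rownames(females_all) <- paste0("f", 1:5)
```

Each matrix then sums to 100 over all of its cells, which is a quick sanity check against the loop version.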