How to find the percentile for each observation in a data frame in R?
Suppose we have a simple data frame:
structure(c(2, 4, 5, 6, 8, 1, 2, 4, 6, 67, 8, 11), dim = c(6L,
2L), dimnames = list(NULL, c("lo", "li")))
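How can I find the percentile of each observation for both variables?
The most R-friendly approach is to (i) convert it to a data frame (or tibble), (ii) reshape the data to long format, (iii) group by variable (lo and li), and (iv) compute the percent rank.
Here is the code: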
library(tidyverse)

# df is the matrix shown in the question
df %>%
  as_tibble() %>%                              # convert to a tibble
  gather(key = variable, value = val) %>%      # gather into long form
  group_by(variable) %>%                       # group by variable (lo and li)
  mutate(percentile = percent_rank(val) * 100) # add the percentile column
   variable   val percentile
   <chr>    <dbl>      <dbl>
 1 lo           2         20
 2 lo           4         40
 3 lo           5         60
 4 lo           6         80
 5 lo           8        100
 6 lo           1          0
 7 li           2          0
 8 li           4         20
 9 li           6         40
10 li          67        100
11 li           8         60
12 li          11         80
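To make the numbers above easier to check, here is a minimal base R sketch of the same calculation. It assumes the percent rank is defined as (minimum rank - 1) / (n - 1), which is how dplyr documents `percent_rank()`; the helper name `pct_rank` is hypothetical and not part of any package.

```r
# Matrix from the question
df <- structure(c(2, 4, 5, 6, 8, 1, 2, 4, 6, 67, 8, 11), dim = c(6L, 2L),
                dimnames = list(NULL, c("lo", "li")))

# Hypothetical helper: (min rank - 1) / (number of non-missing values - 1),
# scaled to the 0-100 range
pct_rank <- function(x) {
  r <- rank(x, ties.method = "min", na.last = "keep")
  (r - 1) / (sum(!is.na(x)) - 1) * 100
}

# Apply the helper to each column; the result is a 6 x 2 matrix
pr <- apply(df, 2, pct_rank)
pr
```

With this definition the smallest value in a column always gets 0 and the largest gets 100, which is why `lo = 1` maps to 0 and `li = 67` maps to 100.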
If you don't want to make the data frame long, just handle the two columns separately:
df %>%
  as_tibble() %>%
  mutate(lo_pr = percent_rank(lo) * 100) %>%
  mutate(li_percentile = percent_rank(li) * 100)
     lo    li lo_pr li_percentile
  <dbl> <dbl> <dbl>         <dbl>
1     2     2    20             0
2     4     4    40            20
3     5     6    60            40
4     6    67    80           100
5     8     8   100            60
6     1    11     0            80
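If there are more than two columns, writing one `mutate()` per column gets tedious. A possible variant, sketched here under the assumption that dplyr 1.0 or later is installed, uses `across()` to apply the same calculation to every column. The `_pr` suffix is an arbitrary naming choice for this example.

```r
library(dplyr)

# Matrix from the question
df <- structure(c(2, 4, 5, 6, 8, 1, 2, 4, 6, 67, 8, 11), dim = c(6L, 2L),
                dimnames = list(NULL, c("lo", "li")))

res <- df %>%
  as_tibble() %>%
  # Add one percent-rank column per existing column, named <column>_pr
  mutate(across(everything(), ~ percent_rank(.x) * 100, .names = "{.col}_pr"))

res
```

This keeps the wide layout and yields the columns `lo`, `li`, `lo_pr`, and `li_pr`.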
Here is a dplyr approach to get the median and the 5% and 95% quantiles.
library(tidyverse)
data = structure(c(2, 4, 5, 6, 8, 1, 2, 4, 6, 67, 8, 11), dim = c(6L,
2L), dimnames = list(NULL, c("lo", "li")))
data %>%
as.data.frame() %>% # Coerce to dataframe
pivot_longer(cols = everything()) %>% # Pivot to long format
group_by(name) %>% # For each unique group..
summarise(perc5 = quantile(value, 0.05), # Calculate 5% quantile
median = median(value), # Calculate median
perc95 = quantile(value, 0.95)) # Calculate 95% quantile
#> # A tibble: 2 × 4
#> name perc5 median perc95
#> <chr> <dbl> <dbl> <dbl>
#> 1 li 2.5 7 53
#> 2 lo 1.25 4.5 7.5
Created on 2023-01-27 with reprex v2.0.2
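For comparison, the same three quantiles can probably be obtained without any packages. The following is a base R sketch, not part of the original answer; it relies on the default interpolation method of `quantile()` (type 7), and the names `probs` and `qs` are chosen only for illustration.

```r
# Matrix from the question
data <- structure(c(2, 4, 5, 6, 8, 1, 2, 4, 6, 67, 8, 11), dim = c(6L, 2L),
                  dimnames = list(NULL, c("lo", "li")))

# 5% quantile, median, and 95% quantile
probs <- c(0.05, 0.5, 0.95)

# Apply quantile() to each column; rows are the quantiles, columns the variables
qs <- apply(data, 2, quantile, probs = probs)
qs
```

The result is a matrix with one row per quantile and one column per variable, so it is transposed relative to the tibble above.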
A data.table solution
library(data.table)
data <- data.table(data)
q <- c(0.05, 0.95)
# Reshape to long format, then compute the requested quantiles per variable
melt(data, measure.vars = names(data))[
  , setNames(as.list(quantile(value, q)), paste("q", q * 100, sep = "_")),
  by = variable
]
Result
variable q_5 q_95
1: lo 1.25 7.5
2: li 2.50 53.0
Data
data = structure(
c(2, 4, 5, 6, 8, 1, 2, 4, 6, 67, 8, 11),
dim = c(6L, 2L),
dimnames = list(NULL, c("lo", "li"))
)