
Nesting a categorical variable, bootstrapping, then extracting the median in R

I'm having trouble with what seems like a simple problem. I have a data frame with some locations, and each location has a value associated with it. I nested the data frame by locations and then bootstrapped the values using purrr and modelr's bootstrap() (see below).

library(tidyverse)
library(modelr)
library(purrr)

locations <- c("grave","pinkham","lower pinkham", "meadow", "dodge", "young")

values <- rnorm(n = 100, mean = 3, sd = .5)
df <- data.frame(values, locations = sample(locations, 100, replace = TRUE))

df.boot <- df %>% 
  nest(data = -locations) %>% 
  mutate(boot = map(data,~bootstrap(.,n=100, id = "values")))

Now I'm trying to get the median from each bootstrap sample in the final list df.boot$boot, but I can't figure it out. I tried map(boot, median), but the more I dig in, the less sense that makes. The vector I want in the boot list seems to be idx, from which I could get the median value and then store it (pretty much what the boot function does, but iterating over each unique level of the categorical variable). Any help would be much appreciated. I might just be going about this the wrong way...

If we need to extract the median of each bootstrap sample:

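library(dplyr)
library(tidyr)   # provides nest()
library(purrr)
library(modelr)
out <- df %>%
         group_by(locations) %>% 
         nest %>% 
         # per location: draw 100 bootstrap resamples, keep one median per resample
         mutate(boot = map(data, ~ bootstrap(.x, n = 100, id = 'values') %>%
                                 pull('strap') %>%           # list of resample objects
                                 map_dbl(~ as_tibble(.x) %>% # materialise the resampled rows
                                          pull('values') %>%
                                          median)))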
out
# A tibble: 6 x 3
# Groups:   locations [6]
#  locations     data              boot       
#  <fct>         <list>            <list>     
#1 pinkham       <tibble [12 × 1]> <dbl [100]>
#2 lower pinkham <tibble [17 × 1]> <dbl [100]>
#3 meadow        <tibble [16 × 1]> <dbl [100]>
#4 dodge         <tibble [22 × 1]> <dbl [100]>
#5 grave         <tibble [21 × 1]> <dbl [100]>
#6 young         <tibble [12 × 1]> <dbl [100]>
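As a possible explanation of why map(boot, median) does not work directly, here is a minimal sketch of what a single bootstrap looks like. It assumes a small toy data frame (toy is a hypothetical name, not from the question) and an arbitrary seed. Each element of the strap column is a resample object that stores the original data plus a vector of row indices idx, so the median of idx would be a median of row numbers rather than of values.

```r
library(modelr)
library(purrr)

set.seed(1)  # arbitrary seed, only for reproducibility
# toy data, assumed for illustration only
toy <- data.frame(values = rnorm(10, mean = 3, sd = 0.5))

boot <- bootstrap(toy, n = 3)  # tibble with a list-column `strap` and an id column
r <- boot$strap[[1]]           # one resample object

class(r)  # "resample"
r$idx     # row indices into toy, drawn with replacement

# the resampled values are the original values looked up by idx
resampled <- toy$values[r$idx]
median(resampled)

# one median per bootstrap resample
meds <- map_dbl(boot$strap, ~ median(as.data.frame(.x)$values))
meds
```

Converting the resample with as.data.frame() (or as_tibble(), as in the answer) performs the idx lookup, which is why the median is taken after that conversion.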

Data

df <- data.frame(values, locations = sample(locations, 100, replace = TRUE))
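With this data, the per-location bootstrap medians could also be cross-checked in base R. The sketch below is not the method from the answer above; boot_medians is a hypothetical helper and the seed is arbitrary. It splits values by location and resamples each group with replacement.

```r
set.seed(42)  # arbitrary seed, only for reproducibility

locations <- c("grave", "pinkham", "lower pinkham", "meadow", "dodge", "young")
values <- rnorm(n = 100, mean = 3, sd = 0.5)
df <- data.frame(values, locations = sample(locations, 100, replace = TRUE))

# hypothetical helper: n_boot bootstrap medians of a numeric vector x
boot_medians <- function(x, n_boot = 100) {
  replicate(n_boot, {
    idx <- sample.int(length(x), replace = TRUE)  # resample row positions
    median(x[idx])
  })
}

# one vector of 100 bootstrap medians per location
meds <- lapply(split(df$values, df$locations), boot_medians)
lengths(meds)
```

Indexing with sample.int() rather than calling sample(x) avoids the special case where sample() treats a single number as a range to draw from.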
