
Conditionally overwriting a list using the purrr package in R?

Say I have the dataset below. It contains the non-centrality parameter (NCP), the degrees of freedom (DF), and the number of simulations (10,000) for each party's candidate in three states. As you can see, some races don't have candidates for a given party:

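library(tidyverse)  # provides tibble(), mutate(), %>% and the purrr map functions

# Length-1 values (df_*, sims_*) are recycled to one entry per state.
dat <- tibble(state = c("Iowa", "Wisconsin", "Minnesota"),
              ncp_D = c(0, 11000, 5700),
              ncp_R = c(10000, 12000, 5000),
              ncp_Ind = c(1800, 0, 600),
              df_D = c(10),
              df_R = c(10),
              df_Ind = c(10),
              sims_D = c(10000),
              sims_R = c(10000),
              sims_Ind = c(10000))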

I would like to produce 10,000 simulations for each candidate in the three states using the purrr package. Below is the code I use to start this process, based on the t-distribution (rt()):

dat_results <- dat %>% 
  mutate(DVotes = pmap(list(sims_D, df_D, ncp_D), rt),
         RVotes = pmap(list(sims_R, df_R, ncp_R), rt),
         IndVotes = pmap(list(sims_Ind, df_Ind, ncp_Ind), rt))

This produces three list-columns of simulated votes in the dat_results data frame, but I ultimately want a candidate's list to be all zeroes when their ncp value is zero. For instance, the D candidate in Iowa should get 10,000 zeroes rather than draws from rt() with an NCP of 0, which include negative values. The same goes for the Ind candidate in Wisconsin. Essentially, I'm trying to conditionally overwrite a list in a data frame.

Is there an easy way to do this in R, preferably using the purrr package? Thanks in advance.

In your case, I think the easiest approach is to wrap the rt() function:

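# Wrapper around rt(): returns n zeroes when ncp is 0, otherwise draws from rt().
# Expects a scalar ncp, which is what pmap() passes for each row.
cond_rt <- function(n, df, ncp, ...){
  if(ncp == 0) return(rep(0, n))
  rt(n, df, ncp, ...)
}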
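
A quick way to convince yourself the wrapper behaves as intended is to call it directly, outside of pmap(). The sketch below repeats the definition so it runs on its own; the argument values are arbitrary illustrations, not taken from the dataset.

```r
# Same wrapper as above, repeated so this snippet is self-contained.
cond_rt <- function(n, df, ncp, ...) {
  if (ncp == 0) return(rep(0, n))
  rt(n, df, ncp, ...)
}

zero_draws <- cond_rt(5, df = 10, ncp = 0)      # no candidate: all zeroes
real_draws <- cond_rt(5, df = 10, ncp = 10000)  # candidate present: delegates to rt()

zero_draws          # five zeroes
length(real_draws)  # same length as requested
```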

Then simply use that modified version:

dat_results <- dat %>% 
  mutate(DVotes = pmap(list(sims_D, df_D, ncp_D), cond_rt),
         RVotes = pmap(list(sims_R, df_R, ncp_R), cond_rt),
         IndVotes = pmap(list(sims_Ind, df_Ind, ncp_Ind), cond_rt))

map_dbl(dat_results$DVotes, length)
#> [1] 10000 10000 10000
map_dbl(dat_results$DVotes, sum)
#> [1]         0 119262980  61756273

But if you really want to modify a column conditionally a posteriori, that can be done with mutate() and if_else(). The catch is that list elements have to be read and written one at a time. rowwise() handles the reading, since it exposes a single row's element at a time, and wrapping each output in list() gives a list of length 1 that can be inserted back as an element.


dat_results2 <- dat %>% 
  mutate(DVotes = pmap(list(sims_D, df_D, ncp_D), rt),
         RVotes = pmap(list(sims_R, df_R, ncp_R), rt),
         IndVotes = pmap(list(sims_Ind, df_Ind, ncp_Ind), rt)) %>%
  rowwise() %>%
  mutate(DVotes = if_else(ncp_D == 0, list(rep(0, length(DVotes))), list(DVotes)),
         RVotes = if_else(ncp_R == 0, list(rep(0, length(RVotes))), list(RVotes)),
         IndVotes = if_else(ncp_Ind == 0, list(rep(0, length(IndVotes))), list(IndVotes)))

map_dbl(dat_results2$DVotes, length)
#> [1] 10000 10000 10000
map_dbl(dat_results2$DVotes, sum)
#> [1]         0 119172966  61629269
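
Since the question asks for purrr, the same after-the-fact overwrite can also be sketched with map2(), which walks a list-column and its matching ncp column in parallel and so avoids rowwise(). The helper name zero_if_no_candidate is made up for this illustration; it is not part of purrr or of the original answer.

```r
library(dplyr)
library(purrr)

dat <- tibble(state = c("Iowa", "Wisconsin", "Minnesota"),
              ncp_D = c(0, 11000, 5700),
              ncp_R = c(10000, 12000, 5000),
              ncp_Ind = c(1800, 0, 600),
              df_D = 10, df_R = 10, df_Ind = 10,
              sims_D = 10000, sims_R = 10000, sims_Ind = 10000)

# Hypothetical helper: replace one vector of draws with zeroes when ncp is 0.
zero_if_no_candidate <- function(votes, ncp) {
  if (ncp == 0) rep(0, length(votes)) else votes
}

dat_results3 <- dat %>%
  # Step 1: simulate for everyone, exactly as in the question.
  mutate(DVotes = pmap(list(sims_D, df_D, ncp_D), rt),
         RVotes = pmap(list(sims_R, df_R, ncp_R), rt),
         IndVotes = pmap(list(sims_Ind, df_Ind, ncp_Ind), rt)) %>%
  # Step 2: overwrite each list element whose ncp is 0.
  mutate(DVotes = map2(DVotes, ncp_D, zero_if_no_candidate),
         RVotes = map2(RVotes, ncp_R, zero_if_no_candidate),
         IndVotes = map2(IndVotes, ncp_Ind, zero_if_no_candidate))
```

map2() always returns a list, so there is no need to wrap the result in list() as in the rowwise() version.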

The repeated per-party calls could probably be simplified with across().
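
across() is not the only way to cut the repetition. One alternative, shown here as a sketch rather than what the answer had in mind, is to reshape the data so that each state-party pair is its own row; a single pmap() call then covers every candidate. This assumes tidyr is available and that the column names keep the prefix_party pattern used in the question.

```r
library(dplyr)
library(purrr)
library(tidyr)

dat <- tibble(state = c("Iowa", "Wisconsin", "Minnesota"),
              ncp_D = c(0, 11000, 5700),
              ncp_R = c(10000, 12000, 5000),
              ncp_Ind = c(1800, 0, 600),
              df_D = 10, df_R = 10, df_Ind = 10,
              sims_D = 10000, sims_R = 10000, sims_Ind = 10000)

# Same wrapper as in the answer: zeroes when ncp is 0, rt() otherwise.
cond_rt <- function(n, df, ncp, ...) {
  if (ncp == 0) return(rep(0, n))
  rt(n, df, ncp, ...)
}

dat_long <- dat %>%
  # "ncp_D" splits into value column "ncp" and party "D", and so on,
  # giving columns state, party, ncp, df, sims (one row per candidate).
  pivot_longer(-state, names_to = c(".value", "party"), names_sep = "_") %>%
  mutate(votes = pmap(list(sims, df, ncp), cond_rt))
```

The result has one row per candidate instead of one row per state, so it may need a pivot_wider() afterwards if the wide layout is required downstream.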


 