
Join across/within nested dataframes

I am pulling information out of a model for eventual plotting. My desired plots are jittered original data with an overlay of mean +/- standard error and text groupings. The model output puts the groupings and the estimates in separate dataframes within a list. I'm using map to extract those and it works, but I'm stuck on the step of joining them together.

I want to join two nested list-cols into a single table and nest that result as a new column. The best I can do currently is to unnest, join the tables, nest again, and join the result back to the original nested table.

library(agricolae)
library(tidyverse)

fitHSD2 <- function(d) HSD.test(aov(mpg ~ cyl, data = d), trt = "cyl") # anova with Tukey HSD

carnestdf <-
    mtcars %>%
    group_by(gear) %>%
    nest() %>%
    mutate(mod = map(data, fitHSD2) # fit model
           , estimates = map(mod, function(df) return(df$means)) # pull out estimates and StdErr
           , estimates = map(estimates, function(df) return(rownames_to_column(df, var = "trt"))) # attach rownames as column for unnest
           , grouping = map(mod, function(df) return(df$groups)) # pull out groupings
           , grouping = map(grouping, function(df) mutate(df, trt = as.character(trt) # convert to character
                                                          , trt = gsub("[[:space:]]*$", "", trt) # remove whitespace at end for join
                                                          , M = as.character(M)
                                                          )
                            )
           )

carnestdf

I can unnest each one and join them. At first I thought I couldn't nest the result and join it back, but in fact I can: I just need to define the join key (by = "gear"). Otherwise the join tries to match on the nested dataframes as well, and that doesn't work without the hashing shown further down.

full_join(unnest(carnestdf , estimates), unnest(carnestdf , grouping)) %>%
group_by(gear) %>%
nest(.key = "estgrp") %>%
full_join(carnestdf, ., by = "gear")
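
The unnest, join, nest, join-back round trip above can be sketched on toy data as follows. This is only a minimal sketch, assuming tidyr 1.0 or later, where nest() is written as new_col = c(cols) rather than with .key. The toy tibble and its values are made up for illustration and stand in for the estimates and grouping list-columns; they are not real HSD.test output. Selecting just the key and one list-column before unnesting keeps the other list-columns out of the join.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for carnestdf: one row per gear, two list-columns
toy <- tibble(
  gear = c(3, 4),
  estimates = list(
    tibble(trt = c("4", "6"), mpg = c(21.5, 19.8)),
    tibble(trt = c("4", "6"), mpg = c(26.9, 19.8))
  ),
  grouping = list(
    tibble(trt = c("4", "6"), M = c("a", "a")),
    tibble(trt = c("4", "6"), M = c("a", "b"))
  )
)

# Unnest each list-column separately, keeping only the key column alongside it
est <- toy %>% select(gear, estimates) %>% unnest(estimates)
grp <- toy %>% select(gear, grouping) %>% unnest(grouping)

# Join the flat tables on explicit keys, then nest the result per gear
estgrp <- full_join(est, grp, by = c("gear", "trt")) %>%
  nest(estgrp = c(trt, mpg, M))

# Attach the new list-column to the original nested table, again with an explicit key
result <- left_join(toy, estgrp, by = "gear")
result
```

Naming the join keys explicitly in both joins is what keeps dplyr from trying to match on the list-columns themselves.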

I found this: R: Join two tables (tibbles) by *list* columns

At first it didn't seem to work: I got the same error when using the hash to join. It does work, though. I needed to set .key in nest so that the new column wasn't named "data", which clashes with the existing data column. I would still prefer to join without unnesting... :/

nestmerge <-
    full_join(unnest(carnestdf , estimates), unnest(carnestdf , grouping)) %>%
    group_by(gear) %>%
    nest(.key = "mergedestgrp") %>%
    mutate_all(funs(hash = map_chr(., digest::digest)))

carnestdf %>%
    mutate_all(funs(hash = map_chr(., digest::digest))) %>%
    full_join(., nestmerge) %>%
    select(-ends_with("hash"))

The answer apparently is map2:

carnestdf <-
    mtcars %>%
    group_by(gear) %>%
    nest() %>%
    mutate(mod = map(data, fitHSD2) # fit model
           , estimates = map(mod, function(df) return(df$means)) # pull out estimates and StdErr
           , estimates = map(estimates, function(df) return(rownames_to_column(df, var = "trt"))) # attach rownames as column for unnest
           , grouping = map(mod, function(df) return(df$groups)) # pull out groupings
           , grouping = map(grouping, function(df) mutate(df, trt = as.character(trt) # convert to character
                                                          , trt = gsub("[[:space:]]*$", "", trt) # remove whitespace at end for join
                                                          , M = as.character(M)
                                                          )
                            )
           , estgrp = map2(estimates, grouping, ~full_join(.x, .y, by = "trt")) # join the two list-cols row by row
           )

carnestdf

For each row, this does a full join of the estimates table and the grouping table by "trt" and stores the result in a new list column, estgrp.
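
To see the map2 step in isolation, without agricolae, here is a minimal sketch on toy data. The toy tibble and its values are made up for illustration; only the map2/full_join call mirrors the code above.

```r
library(dplyr)
library(purrr)

# Hypothetical stand-ins for the estimates and grouping list-columns
toy <- tibble(
  gear = c(3, 4),
  estimates = list(
    tibble(trt = c("4", "6"), mpg = c(21.5, 19.8)),
    tibble(trt = c("4", "6"), mpg = c(26.9, 19.8))
  ),
  grouping = list(
    tibble(trt = c("4", "6"), M = c("a", "a")),
    tibble(trt = c("4", "6"), M = c("a", "b"))
  )
)

# map2 walks the two list-columns in parallel: .x is one estimates table,
# .y is the grouping table from the same row
toy <- toy %>%
  mutate(estgrp = map2(estimates, grouping, ~ full_join(.x, .y, by = "trt")))

# Each element of estgrp is a joined table with columns trt, mpg and M
toy$estgrp[[2]]
```

Because the join happens inside each row, no unnesting and no gear key are needed; the only key is the one shared by the two inner tables.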
