Format a list of lists of dataframes

I am trying to learn purrr, so I am looking for solutions that use it. Suppose I have a list with 2 elements, each of which is itself a list of data frames:

a1 <- data.frame(a = c('alfa', 'beta', 'omega'), b = rnorm(3,0,1), c = NA)
a2 <- data.frame(a = c('lambda', 'delta', 'epsilon'), b = rnorm(3,0, 1), c = NA)
b1 <- data.frame(a = c('lambda', 'delta', 'alfa'), b = rnorm(3, 1, 1), c = 1)
b2 <- data.frame(a = c('beta', 'delta', 'epsilon'), b = rnorm(3, 1, 2), c = c(0, 1, NA))

a <- list(a1, a2)
b <- list(b1, b2)

L <- list(a,b)

How can I format L using the map*() functions so that the first column of every data frame is converted to character (I do not need the general case in which any column could be a factor) and rows containing NAs are removed?

Since there is more than one level of nesting, I don't know how to call functions without unnesting anything.

One dplyr and purrr option could be:

map_depth(.x = L, 2, ~ .x %>%
           mutate_at(1, as.character) %>%
           na.omit())

[[1]]
[[1]][[1]]
[1] a b c
<0 rows> (or 0-length row.names)

[[1]][[2]]
[1] a b c
<0 rows> (or 0-length row.names)


[[2]]
[[2]][[1]]
       a         b c
1 lambda 0.6691767 1
2  delta 1.5106571 1
3   alfa 1.8121246 1

[[2]][[2]]
      a          b c
1  beta -0.4429880 0
2 delta -0.7539317 1
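In dplyr 1.0.0 and later, scoped verbs such as mutate_at() are superseded by across(). Below is a minimal sketch of the same idea in that style, run on a smaller made-up list. stringsAsFactors = TRUE is set only so that the conversion has something to do on R 4.0 or newer, where strings are no longer turned into factors by default. The names clean_frame and res are illustrative and not from the answer above.

```r
library(purrr)
library(dplyr)

set.seed(42)  # only to make the random column reproducible

# Two small data frames, each wrapped in its own inner list
a1 <- data.frame(a = c("alfa", "beta", "omega"), b = rnorm(3), c = NA,
                 stringsAsFactors = TRUE)
b2 <- data.frame(a = c("beta", "delta", "epsilon"), b = rnorm(3, 1, 2),
                 c = c(0, 1, NA), stringsAsFactors = TRUE)
L <- list(list(a1), list(b2))

# Convert the first column (selected by position) to character,
# then drop every row that contains an NA
clean_frame <- function(df) {
  df %>%
    mutate(across(1, as.character)) %>%
    na.omit()
}

# Depth 2 means: apply clean_frame to the elements of the elements of L
res <- map_depth(L, 2, clean_frame)
str(res)
```

The purrr documentation describes map_depth(x, 2, f) as equivalent to map(x, function(y) map(y, f)), so the nested form should work as well if you prefer to spell out each level.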

For the sake of completeness, and for future users who prefer to stick with base R, here is a base solution. It assumes the column to convert is named a in every data frame:

lapply(L, lapply, function(x) na.omit(within(x, {
  a <- as.character(a)
})))
[[1]]
[[1]][[1]]
[1] a b c
<0 rows> (or 0-length row.names)

[[1]][[2]]
[1] a b c
<0 rows> (or 0-length row.names)


[[2]]
[[2]][[1]]
       a          b c
1 lambda -0.7389969 1
2  delta  0.9791327 1
3   alfa  1.4097145 1

[[2]][[2]]
      a          b c
1  beta -0.3176996 0
2 delta  2.8242954 1

Alternatively, if the column names differ across data frames, convert the first column by position instead:

lapply(L, lapply, function(x) na.omit(replace(x, 1, as.character(x[, 1]))))

NOTE: This is probably less flexible than the purrr solution, since the nesting depth is fixed by the number of lapply calls (with purrr's map_depth you choose the depth as an argument). There might be a way to do this with rapply or lapply itself, but I'm not aware of one.

Structure of the result:

List of 2
 $ :List of 2
  ..$ :'data.frame':    0 obs. of  3 variables:
  .. ..$ a: chr(0) 
  .. ..$ b: num(0) 
  .. ..$ c: logi(0) 
  .. ..- attr(*, "na.action")= 'omit' Named int [1:3] 1 2 3
  .. .. ..- attr(*, "names")= chr [1:3] "1" "2" "3"
  ..$ :'data.frame':    0 obs. of  3 variables:
  .. ..$ a: chr(0) 
  .. ..$ b: num(0) 
  .. ..$ c: logi(0) 
  .. ..- attr(*, "na.action")= 'omit' Named int [1:3] 1 2 3
  .. .. ..- attr(*, "names")= chr [1:3] "1" "2" "3"
 $ :List of 2
  ..$ :'data.frame':    3 obs. of  3 variables:
  .. ..$ a: chr [1:3] "lambda" "delta" "alfa"
  .. ..$ b: num [1:3] -0.739 0.979 1.41
  .. ..$ c: num [1:3] 1 1 1
  ..$ :'data.frame':    2 obs. of  3 variables:
  .. ..$ a: chr [1:2] "beta" "delta"
  .. ..$ b: num [1:2] -0.318 2.824
  .. ..$ c: num [1:2] 0 1
  .. ..- attr(*, "na.action")= 'omit' Named int 3
  .. .. ..- attr(*, "names")= chr "3"
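On the point about depth control: one way to avoid hard-coding two lapply levels in base R is a small recursive helper that descends until it reaches a data frame. The sketch below is an assumption about how this could look, not part of the original answer. map_frames and clean_frame are hypothetical names, and the example list is deliberately nested to uneven depths.

```r
# Hypothetical helper: walk a nested list and apply `f` to every data frame,
# however deeply it is nested. The is.data.frame() check must come first,
# because a data frame is itself a list.
map_frames <- function(x, f) {
  if (is.data.frame(x)) {
    f(x)
  } else if (is.list(x)) {
    lapply(x, function(el) map_frames(el, f))
  } else {
    x  # leave any other kind of leaf untouched
  }
}

# Convert the first column to character by position, then drop rows with NAs
clean_frame <- function(df) {
  df[[1]] <- as.character(df[[1]])
  na.omit(df)
}

set.seed(1)  # only to make the random column reproducible
a1 <- data.frame(a = c("alfa", "beta", "omega"), b = rnorm(3), c = NA,
                 stringsAsFactors = TRUE)
b2 <- data.frame(a = c("beta", "delta", "epsilon"), b = rnorm(3, 1, 2),
                 c = c(0, 1, NA), stringsAsFactors = TRUE)

# a1 sits at depth 2, b2 at depth 3
L <- list(list(a1), list(list(b2)))

res <- map_frames(L, clean_frame)
str(res)
```

Unlike map_depth, this does not let you stop at a chosen level; it always goes down to the data frames, which is enough for the case in the question.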
