Names of nested list containing dots (e.g. "c.2")

How can I get the names of the leaves of a nested list (which also contains a data frame)

p <- list(a=1,b=list(b1=2,b2=3),c=list(c1=list(c11='a',c12='x'),c.2=data.frame("t"=1)))

in the following format, with one character vector per leaf:

[[1]]
[1] "a"
[[2]]
[1] "b" "b1"
[[3]]
[1] "b" "b2"
[[4]]
[1] "c" "c1" "c11"
[[5]]
[1] "c" "c1" "c12"
[[6]]
[1] "c" "c.2"

The problem is that my list contains names with a dot (e.g. "c.2"). With unlist, one gets "c.c.2", and neither I nor strsplit can tell whether a given dot is a delimiter added by unlist or part of a name. That is the difference from this question.
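A minimal sketch of why the flattened names are ambiguous (the lists x and y are made up for illustration and are not part of the original data): two differently nested lists end up with the same name after unlist, so splitting on the dot cannot recover the structure.

```r
# Hypothetical lists: x has a dotted name at level 2, y has three real levels.
x <- list(c = list(c.2 = 1))
y <- list(c = list(c = list(`2` = 1)))

names(unlist(x))
names(unlist(y))

# Both calls give the same string, so the original nesting is lost.
identical(names(unlist(x)), names(unlist(y)))

# Splitting on "." treats every dot as a level separator, including the one in "c.2".
strsplit(names(unlist(x)), ".", fixed = TRUE)[[1]]
```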

The function should not descend into data.frames but treat them as leaves. My approach so far is adapted from here, but it struggles with the dots introduced by unlist:

listNames = function(l, maxDepth = 2) {
  # Replace every leaf (non-list, data.frame, or depth limit reached) with TRUE
  listNames_rec = function(l, n) {
    if (!is.list(l) || is.data.frame(l) || n >= maxDepth) TRUE
    else lapply(l, listNames_rec, n + 1)
  }
  # unlist() joins the nested names with ".", which is where the ambiguity arises
  names(unlist(listNames_rec(l, 0)))
}
listNames(p, maxDepth = 3)
#[1] "a"        "b.b1"     "b.b2"     "c.c1.c11" "c.c1.c12" "c.c.2"

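Like this? Replace the dots in the names with a placeholder (":") before calling listNames, so that the only dots left are the ones added by unlist. Then turn those into "$" and restore the original dots: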

subnames <- function(L, s) {
  if (!is.list(L) || is.data.frame(L)) return(L)
  names(L) <- gsub(".", s, names(L), fixed = TRUE)
  lapply(L, subnames, s)
}

res <- listNames(subnames(p, ":"), maxDepth = 3)
gsub(":", ".",
  gsub(".", "$", res, fixed = TRUE),
  fixed = TRUE
)
#[1] "a"        "b$b1"     "b$b2"     "c$c1$c11" "c$c1$c12" "c$c.2" 
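To get from this "$"-separated vector to the list format asked for in the question, the strings can be split on the new separator. This is only a sketch: the vector flat is typed in by hand from the output above, and the approach assumes that no name in the list contains "$" or ":" itself.

```r
# Assumed input: the "$"-separated names printed above.
flat <- c("a", "b$b1", "b$b2", "c$c1$c11", "c$c1$c12", "c$c.2")

# fixed = TRUE makes "$" a literal separator instead of a regex anchor.
paths <- strsplit(flat, "$", fixed = TRUE)
str(paths)
```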

Not a full answer, but I imagine the rrapply package could help here. One option is to extract all names:

library(rrapply)
library(dplyr)
rrapply(p, how = "melt") %>% 
  select(-value)
#   L1   L2   L3
# 1  a <NA> <NA>
# 2  b   b1 <NA>
# 3  b   b2 <NA>
# 4  c   c1  c11
# 5  c   c1  c12
# 6  c  c.2    t

The problem here is that the column names of the data.frame (t) are included as well, so you could extract the data.frame paths separately:

#extract data frame name
rrapply(p, classes = "data.frame", how = "melt") %>% 
  select(-value)
#   L1  L2
# 1  c c.2

Then you could combine these two data sets and, for example, remove duplicates while keeping the data frame names:

rrapply(p, how = "melt") %>%  
  bind_rows(rrapply(p, classes = "data.frame", how = "melt")) 
  #then filter etc...
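One possible way to do that filtering in base R is sketched below. It is not part of the answer: the two data frames all_names and df_names are typed in by hand from the melted output shown above, and the code assumes that data frames sit at the second level, as in the example.

```r
# Name columns copied from the two melted results shown above (values dropped).
all_names <- data.frame(
  L1 = c("a", "b", "b", "c", "c", "c"),
  L2 = c(NA, "b1", "b2", "c1", "c1", "c.2"),
  L3 = c(NA, NA, NA, "c11", "c12", "t"),
  stringsAsFactors = FALSE
)
df_names <- data.frame(L1 = "c", L2 = "c.2", stringsAsFactors = FALSE)

# Flag rows whose first two levels match the path of a data frame.
inside_df <- vapply(seq_len(nrow(all_names)), function(i) {
  any(all_names$L1[i] == df_names$L1 & all_names$L2[i] == df_names$L2,
      na.rm = TRUE)
}, logical(1))

# Cut those rows off at the data frame itself, dropping its column names.
all_names$L3[inside_df] <- NA

# One character vector per row without the NA padding; unique() removes
# the repeats that appear when a data frame has several columns.
paths <- unique(lapply(seq_len(nrow(all_names)), function(i) {
  x <- unlist(all_names[i, ], use.names = FALSE)
  x[!is.na(x)]
}))
```

Because the names stay in separate columns throughout, the dot in "c.2" is never confused with a separator.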

One way might be a recursive function that carries the names collected so far (N) down to each leaf:

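# l: list to walk, n: remaining depth, N: names collected so far (innermost first)
listNames = function(l, n, N) {
  # Leaf, data.frame or depth used up: return the path in outer-to-inner order.
  # Otherwise recurse with each child name prepended to N and flatten one level.
  if(!is.list(l) || is.data.frame(l) || n<1) list(rev(N))
  else unlist(Map(listNames, l, n=n-1, N=lapply(names(l), c, N)), FALSE, FALSE)
}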

listNames(p, 3, NULL)
#[[1]]
#[1] "a"
#
#[[2]]
#[1] "b"  "b1"
#
#[[3]]
#[1] "b"  "b2"
#
#[[4]]
#[1] "c"   "c1"  "c11"
#
#[[5]]
#[1] "c"   "c1"  "c12"
#
#[[6]]
#[1] "c"   "c.2"
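Since every path is kept as a character vector, it can be used directly to index back into the list, because `[[` accepts a vector of names and applies them level by level. A small sketch of that usage, repeating p and the function above so it runs on its own (the names leaves and labels are made up for illustration):

```r
p <- list(a = 1, b = list(b1 = 2, b2 = 3),
          c = list(c1 = list(c11 = "a", c12 = "x"), c.2 = data.frame(t = 1)))

# Same function as in the answer, repeated so the snippet is self-contained.
listNames <- function(l, n, N) {
  if (!is.list(l) || is.data.frame(l) || n < 1) list(rev(N))
  else unlist(Map(listNames, l, n = n - 1, N = lapply(names(l), c, N)), FALSE, FALSE)
}

paths <- listNames(p, 3, NULL)

# `[[` with a character vector indexes recursively, one name per level.
leaves <- lapply(paths, function(path) p[[path]])

# Readable labels can still be built for display without losing the real paths.
labels <- vapply(paths, paste, character(1), collapse = "$")
```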
