简体   繁体   中英

Create nested list structure from a string

I have a string that is a composite of n substrings. It could look like this:

string <- c("A_AA", "A_BB", "A_BB_AAA", "B_AA", "B_BB", "B_CC")

Every subcomponent in this string is separated from any other by "_". Here, the first level consists of the values "A" and "B", the second level of "AA", "BB" and "CC", the third level of "AAA". Deeper nestings are possible and the solution should extend to those cases. The nestings are not necessarily balanced, eg "A" only has two children, while "B" has three, but it also has a grandchild which "B" has not.

Essentially, I want to recreate the nested structure in this string in some kind of R object, preferably a list. Thus, the nested list structure that would look like this:

list("A" = list("AA", "BB" = list("AAA")),
"B" = list("AA", "BB", "CC"))

> $A
  $A[[1]]

  [1] "AA"
  $A$BB
  $A$BB[[1]]
  [1] "CCC"

  $B
  $B[[1]]
  [1] "AA"

  $B[[2]]
  [1] "BB"

  $B[[3]]
  [1] "CC"

Any help on this is appreciated

You can make it into a matrix without too much fuss...

string <- c("A_AA", "A_BB", "A_BB_AAA", "B_AA", "B_BB", "B_CC")

splitted<-strsplit(string,"_")
cols<-max(lengths(splitted))
mat<-do.call(rbind,lapply(splitted, "length<-", cols))

Not so straight forward, also not the most beautiful code, but it should do its job and return a list:

string <- c("A_AA", "A_BB", "A_BB_AAA", "B_AA", "B_BB", "B_CC")

# loop through each element of the string "str_el"
list_els <- lapply(string, function(str_el) {

  # split the string into parts
  els <- strsplit(str_el, "_")[[1]]

  # loop backwards through the elements
  for (i in length(els):1){

    # the last element gives the value
    if (i == length(els)){

      # assign the value to a list and rename the list          
      res <- list(els[[i]])
      names(res) <- els[[i - 1]]

    } else {
      # if its not the last element (value) assign the list res to another list
      # with the name of that element
      if (i != 1) {
        res <- list(res)
        names(res) <- els[[i - 1]]
      }
    }
  }

  return(res)
})

# combine the lists into one list
res_list <- mapply(c, list_els, SIMPLIFY = F)

res_list
# [[1]]
# [[1]]$A
# [1] "AA"
# 
# 
# [[2]]
# [[2]]$A
# [1] "BB"
# 
# 
# [[3]]
# [[3]]$A
# [[3]]$A$BB
# [1] "AAA"
# 
# 
# 
# [[4]]
# [[4]]$B
# [1] "AA"
# 
# 
# [[5]]
# [[5]]$B
# [1] "BB"
# 
# 
# [[6]]
# [[6]]$B
# [1] "CC"

Does that give you what you want?

I found this way to do it. It's weird, but seems to work

my_relist <- function(x){
y=list()
#This first loop creates the skeleton of the list
for (name in x){
    split=strsplit(name,'_',fixed=TRUE)[[1]]
    char='y'
    l=length(split)
    for (i in 1:(l-1)){
        char=paste(char,'$',split[i],sep="")
    }
char2=paste(char,'= list()',sep="")
#Example of char2: "y$A$BB=list()"
eval(parse(text=char2))
#Evaluates the expression inside char2
}

#The second loop fills the list with the last element
for (name in x){
   split=strsplit(name,'_',fixed=TRUE)[[1]]
   char='y'
   l=length(split)
   for (i in 1:(l-1)){
       char=paste(char,'$',split[i],sep="")
   }
char3=paste(char,'=c(',char,',split[l])')
#Example of char3: "y$A = c(y$A,"BB")"
eval(parse(text=char3))
}
return(y)
}

And this is the result:

example <- c("A_AA_AAA", "A_BB", "A_BB_AAA", "B_AA", "B_BB", "B_CC")
my_relist(example)
#$A
#$BB
#1.'AAA'
#[[2]]
#'AA'
#[[3]]
#'BB'
#$B
#1.'AA'
#2.'BB'
#3.'CC'

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM