简体   繁体   中英

Getting the set of nodes connected till the main parent node in R

I have a data set which has 6 rows and 3 columns. The first column represents children, whereas second column onward immediate parents of the corresponding child is allocated. 在此处输入图片说明

Above, one can see that "a" and "b" don't have any parents. whereas "c" has only parent and that is "a". "d" has parents "b" and "c" and so on.

What I need is: if given the input as the child, it should give me all the ancestors of that child including child.

eg "f" is the child I chose then desired output should be : {"f", "d", "b"}, {"f", "d", "c", "a"}, {"f", "e", "b"}, {"f", "e", "c", "a"}.

Note: Order of the nodes does not matter.

Thank you so much in advance.

Create sample data. Note use of stringsAsFactors here, I'm assuming your data are characters and not factors:

> d <- data.frame(list("c" = c("a", "b", "c", "d", "e", "f"), "p1" = c(NA, NA, "a", "b", "b", "d"), "p2" = c(NA, NA, NA, "c", "c", "e")),stringsAsFactors=FALSE)

First tidy it up - make the data long, not wide, with each row being a child-parent pair:

> pairs = subset(reshape2::melt(d,id.vars="c",value.name="parent"), !is.na(parent))[,c("c","parent")]
> pairs
   c parent
3  c      a
4  d      b
5  e      b
6  f      d
10 d      c
11 e      c
12 f      e

Now we can make a graph of the parent-child relationships. This is a directed graph, so plots child-parent as an arrow:

> g = graph.data.frame(pairs)
> plot(g)

在此处输入图片说明

Now I'm not sure exactly what you want, but igraph functions can do anything... So for example, here's a search of the graph starting at d from which we can get various bits of information:

> d_search = bfs(g,"d",neimode="out", unreachable=FALSE, order=TRUE, dist=TRUE)

First, which nodes are ancestors of d ? Its the ones that can be reached from d via the exhaustive (here, breadth-first) search:

> d_search$order
+ 6/6 vertices, named:
[1] d    c    b    a    <NA> <NA>

Note it includes d as well. Trivial enough to drop from this list. That gives you the set of ancestors of d which is what you asked for.

What is the relationship of those nodes to d ?

> d_search$dist
  c   d   e   f   a   b 
  1   0 NaN NaN   2   1

We see that e and f are unreachable, so are not ancestors of d . c and b are direct parents, and a is a grandparent. You can check this from the graph.

You can also get all the paths from any child upwards using functions like shortest_paths and so on.

Here is a recursive function that makes all possible family lines:

d <- data.frame(list("c" = c("a", "b", "c", "d", "e", "f"), 
      "p1" = c(NA, NA, "a", "b", "b", "d"), 
      "p2" = c(NA, NA, NA, "c", "c", "e")), stringsAsFactors = F)

# Make data more convenient for the task.
library(reshape2)
dp <-  melt(d, id = c("c"), value.name = "p") 

# Recursive function builds ancestor vectors.
getAncestors <- function(data, x, ancestors = list(x)) {

  parents <- subset(data, c %in% x & !is.na(p), select = c("c", "p"))

  if(nrow(parents) == 0) {
    return(ancestors)
  }

  x.c <- parents$c
  p.c <- parents$p

  ancestors <- lapply(ancestors, function(x) {
    if (is.null(x)) return(NULL)

    # Here we want to repeat ancestor chain for each new parent.
    res <- list()
    matches <- 0
    for (i in 1:nrow(parents)) {
      if (tail(x, 1) == parents[i, ]$c){
       res[[i]] <- c(x, parents[i, ]$p)
       matches <- matches + 1
      }
    }

    if (matches == 0) { # There are no more parents. 
      res[[1]] <- x
    }

    return (res)
  })

  # remove one level of lists.
  ancestors <- unlist(ancestors, recursive = F)

  res <- getAncestors(data, p.c, ancestors)
  return (res)

}

# Demo of results for the lowest level.
res <- getAncestors(dp, "f")
res
#[[1]]
#[1] "f" "d" "b"

#[[2]]
#[1] "f" "d" "c" "a"

#[[3]]
#[1] "f" "e" "b"

#[[4]]
#[1] "f" "e" "c" "a"

You will need to implement this in a similar way through recursion or with a while loop.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM