It is possible to flatten lists of lists using unlist(list, recursive = FALSE), as was shown in this question. This action concatenates list names using the default dot (.) separator, which is standard for variable naming in R. A simple example illustrates this:
# Create example list, l
> l <- list("a" = list("x" = 1, "y" = 2), "b" = list("x" = 3, "y" = 4))
> l
$a
$a$x
[1] 1
$a$y
[1] 2
$b
$b$x
[1] 3
$b$y
[1] 4
# Unlist lists in l
> l.unlisted <- unlist(l, recursive = FALSE)
> l.unlisted
$a.x
[1] 1
$a.y
[1] 2
$b.x
[1] 3
$b.y
[1] 4
In spite of the standard naming convention, I want the names to use a different separator (_). This can be done through string manipulation, using sub() to find and replace the default . separator in each name after unlist() has already concatenated the names, as follows:
> names(l.unlisted) <- sub('.', '_', names(l.unlisted), fixed=TRUE)
> l.unlisted
$a_x
[1] 1
$a_y
[1] 2
$b_x
[1] 3
$b_y
[1] 4
While this would be sufficient in most situations, I think that the extra step can be eliminated by altering the default separator used by unlist(). I hypothesize that this can be done by altering the source code of the function using fix(), adding a sep argument similar to the one used in paste(). However, I do not know how to do so, as unlist() is an internal function.
Is there a way to alter the default name concatenation separator in unlist(), and how can this be done?
While one can search and replace the dots, as suggested in a comment by akrun, this is a hack that doesn't necessarily work if there are dots in the names already. Here's a more robust solution.
An example list:
ex_list = list(
a = c(x1=1, x2=2, x3=3),
b = c(y1=1, y2=2),
c = c(z1=1)
)
looks like:
> ex_list
$a
x1 x2 x3
1 2 3
$b
y1 y2
1 2
$c
z1
1
The usual approaches:
> #tries
> unlist(ex_list)
a.x1 a.x2 a.x3 b.y1 b.y2 c.z1
1 2 3 1 2 1
> do.call(what = c, args = ex_list)
a.x1 a.x2 a.x3 b.y1 b.y2 c.z1
1 2 3 1 2 1
> unlist(unname(ex_list))
x1 x2 x3 y1 y2 z1
1 2 3 1 2 1
The first two join the names using the dot (.) separator; the third uses no prefix (useful in some cases).
A function:
#with custom separator
unlist2 = function(x, sep = "_") {
  #save top names
  top_names = names(x)
  x = unname(x)

  #flatten
  x2 = unlist(x)

  #add prefix
  #determine how many prefixes to add of each
  lengths_top = lengths(x)
  prefixes = rep(top_names, times = lengths_top)
  names(x2) = paste0(prefixes, sep, names(x2))

  x2
}
Test it:
> #tests
> unlist2(ex_list)
a_x1 a_x2 a_x3 b_y1 b_y2 c_z1
1 2 3 1 2 1
> unlist2(ex_list, sep = "-")
a-x1 a-x2 a-x3 b-y1 b-y2 c-z1
1 2 3 1 2 1
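For the list-of-lists case from the question (the unlist(l, recursive = FALSE) situation), the same prefixing idea can be sketched so that the result stays a list. This is only an illustration: flatten_once is a hypothetical helper name, and it assumes the outer list is fully named and that one level of flattening is wanted. The last lines show why prefixing is safer than replacing dots when a name already contains a dot.

```r
# Hypothetical helper (not part of base R): flatten one level of a named
# list, joining outer and inner names with `sep`. The result is a list.
flatten_once <- function(x, sep = "_") {
  stopifnot(is.list(x), !is.null(names(x)))
  # Turn every element into a list so atomic vectors are handled too
  parts <- lapply(x, as.list)
  # One outer name per inner element
  outer <- rep(names(parts), times = lengths(parts))
  # Drop the outer names first so unlist() does not add its own "." prefix
  flat <- unlist(unname(parts), recursive = FALSE)
  if (length(flat) == 0) return(list())
  inner <- names(flat)
  if (is.null(inner)) inner <- rep("", length(flat))
  # Unnamed inner elements simply receive the outer name
  names(flat) <- ifelse(inner == "", outer, paste(outer, inner, sep = sep))
  flat
}

l <- list("a" = list("x" = 1, "y" = 2), "b" = list("x" = 3, "y" = 4))
names(flatten_once(l))             # "a_x" "a_y" "b_x" "b_y"
names(flatten_once(l, sep = "-"))  # "a-x" "a-y" "b-x" "b-y"

# Names that already contain dots: prefixing keeps them intact,
# while replacing the first dot after unlist() hits the wrong one.
dotted <- list(a.b = list(x.y = 1), c = list(z = 2))
names(flatten_once(dotted))                        # "a.b_x.y" "c_z"
nm <- names(unlist(dotted, recursive = FALSE))     # "a.b.x.y" "c.z"
sub(".", "_", nm, fixed = TRUE)                    # "a_b.x.y" "c_z"
```

Unlike unlist(), this sketch does not number unnamed inner elements, so duplicated names are possible when inner names are missing.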
unlist()
The base R function calls .Internal, so we can't modify it easily:
> unlist
function (x, recursive = TRUE, use.names = TRUE)
{
    if (.Internal(islistfactor(x, recursive))) {
        lv <- unique(.Internal(unlist(lapply(x, levels), recursive,
            FALSE)))
        nm <- if (use.names)
            names(.Internal(unlist(x, recursive, use.names)))
        res <- .Internal(unlist(lapply(x, as.character), recursive,
            FALSE))
        res <- match(res, lv)
        structure(res, levels = lv, names = nm, class = "factor")
    }
    else .Internal(unlist(x, recursive, use.names))
}
<bytecode: 0x558a410998b0>
<environment: namespace:base>
According to the docs for .Internal:
Only true R wizards should even consider using this function, and only R developers can add to the list of internal functions.
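Since unlist() itself cannot easily be given a sep argument, a plain R wrapper that builds the names itself is one way around the problem. The sketch below extends the idea to lists nested to any depth; flatten_deep and its prefix argument are hypothetical names introduced here for illustration, and the function assumes that every list element (including data frames, which are lists) should be descended into.

```r
# Hypothetical helper (not part of base R): recursively flatten a nested
# list, joining the names along each path with `sep`. Returns a flat list.
flatten_deep <- function(x, sep = "_", prefix = "") {
  nms <- names(x)
  if (is.null(nms)) nms <- rep("", length(x))
  out <- list()
  for (i in seq_along(x)) {
    # Join the non-empty name pieces so missing names add no stray separator
    pieces <- c(prefix, nms[i])
    key <- paste(pieces[nzchar(pieces)], collapse = sep)
    if (is.list(x[[i]])) {
      # Descend, carrying the path built so far
      out <- c(out, flatten_deep(x[[i]], sep = sep, prefix = key))
    } else {
      # Leaf: keep the value as a single list element under the joined name
      out <- c(out, setNames(list(x[[i]]), key))
    }
  }
  out
}

nested <- list(a = list(x = 1, y = list(p = 2, q = 3)), b = 4)
names(flatten_deep(nested))             # "a_x" "a_y_p" "a_y_q" "b"
names(flatten_deep(nested, sep = "-"))  # "a-x" "a-y-p" "a-y-q" "b"
```

Growing the result with c() inside a loop is simple but not the fastest option for very large lists; it is used here only to keep the sketch short.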