I have a column of type "list" in my data frame, I want to create a column with the sum.
I guess there is no visual difference, but my column consists of list(1,2,3)
s and not c(1,2,3)
s:
tibble(
MY_DATA = list(
list(2, 7, 8),
list(3, 10, 11),
list(4, 2, 8)
),
NOT_MY_DATA = list(
c(2, 7, 8),
c(3, 10, 11),
c(4, 2, 8)
)
)
Unfortunately when I try mutate(NEW_COL = MY_LIST_COL_D %>% unlist() %>% sum())
the result is that every cell in the new column contains the sum of the entire source column (so a value in the millions)
I tried reduce
and it did work, but was slow, I was looking for a better solution.
You could use the purrr::map_dbl
, which should return a vector of type double
:
library(tibble)
library(dplyr)
library(purrr)
df = tibble(
MY_LIST_COL_D = list(
c(2, 7, 8),
c(3, 10, 11),
c(4, 2, 8)
)
)
df %>%
mutate(NEW_COL= map_dbl(MY_LIST_COL_D, sum), .keep = 'unused')
# NEW_COL
<dbl>
# 1 17
# 2 24
# 3 14
Is this what you were looking for? If you don't want to remove the list column just disregard the .keep
argument.
Update With the underlying structure being lists, you can still apply the same logic, but one way to solve the issue is to unlist
:
df = tibble(
MY_LIST_COL_D = list(
list(2, 7, 8),
list(3, 10, 11),
list(4, 2, 8)
)
)
df %>%
mutate(NEW_COL = map_dbl(MY_LIST_COL_D, ~ sum(unlist(.x))), .keep = 'unused')
# NEW_COL
# <dbl>
# 1 17
# 2 24
# 3 14
You can use rowwise
in dplyr
library(dplyr)
df %>% rowwise() %>% mutate(NEW_COL = sum(MY_LIST_COL_D))
rowwise
will also make your attempt work:
df %>% rowwise() %>% mutate(NEW_COL = MY_LIST_COL_D %>% unlist() %>% sum())
Can also use sapply
in base R:
df$NEW_COL <- sapply(df$MY_LIST_COL_D, sum)
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.