My problem is a bit tricky: I'm working on data editing and I'm close to finding the right solution. I have a dataframe like this:
ID name var1 var2 var3 var_total
1 a 1 1 2 4
2 b 2 3 2 7
3 c 1 -1 -1 1
Where var_total is the sum, across var1 to var3, of each value that is greater than zero. Say that for ID == 2 I need to change var2 to -1, like this:
df %>% mutate(var2 = if_else(ID == 2, -1, var2))
Which gives this:
ID name var1 var2 var3 var_total
1 a 1 1 2 4
2 b 2 -1 2 7
3 c 1 -1 -1 1
The problem is, I need to find a way to automatically re-calculate var_total for that row. I know how to do it for the whole dataframe, but that's a bit slow:
df %>%
  rowwise() %>%
  mutate(var_total = {
    x <- c_across(starts_with('var'))
    sum(x[x > 0])
  })
Is there any way to perform this operation only on the selected ID? In this case, my final dataframe would be:
ID name var1 var2 var3 var_total
1 a 1 1 2 4
2 b 2 -1 2 4
3 c 1 -1 -1 1
Thanks!
If you want to efficiently update a single row (or a small subset of rows), I would use direct assignment, not dplyr.
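# Columns named var1, var2, ...; the anchored pattern excludes var_total
var_cols = grep("^var[0-9]+$", names(df), value = TRUE)
recalc_id = 2
# Recompute var_total only for the rows whose ID is in recalc_id
rows = df$ID %in% recalc_id
df[rows, "var_total"] = apply(df[rows, var_cols], 1, \(x) sum(x[x > 0]))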
As akrun points out in the comments, if it is just a single row, the apply call can be skipped:
i = which(df$ID == recalc_id)
row = unlist(df[i, var_cols])
df$var_total[i] = sum(row[row > 0])
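To tie the edit and the recalculation together, the two steps could be wrapped in a small helper. This is only a sketch: set_and_recalc() and its arguments are hypothetical names, not part of any package, and the data frame is rebuilt from the question's example so the snippet runs on its own.

```r
# Example data from the question
df <- data.frame(
  ID = 1:3,
  name = c("a", "b", "c"),
  var1 = c(1, 2, 1),
  var2 = c(1, 3, -1),
  var3 = c(2, 2, -1),
  var_total = c(4, 7, 1)
)

# Hypothetical helper: set one cell, then recompute var_total
# for the touched rows only
set_and_recalc <- function(df, id, col, value) {
  # Anchored pattern so that var_total itself is not summed
  var_cols <- grep("^var[0-9]+$", names(df), value = TRUE)
  rows <- which(df$ID %in% id)
  df[rows, col] <- value
  for (i in rows) {
    x <- unlist(df[i, var_cols])
    df$var_total[i] <- sum(x[x > 0])
  }
  df
}

df <- set_and_recalc(df, id = 2, col = "var2", value = -1)
df$var_total
```

Only the rows matching id are touched, so the cost scales with the number of edited rows rather than with the size of the data frame.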
Here's the same thing with dplyr::case_when, for a dplyr solution:
df = df %>%
  rowwise() %>%
  mutate(var_total = case_when(
    ID %in% 2 ~ {
      # matches() takes a regex; starts_with() only matches a literal prefix
      x <- c_across(matches('^var[0-9]+$'))
      sum(x[x > 0])
    },
    TRUE ~ var_total
  )) %>%
  ungroup()
(Note that in both cases we need to change the column name pattern to not include var_total in the sum.)
rowwise breaks some vectorization and slows things down, so if you are so concerned about efficiency that recalculating the sum is "too slow", I'd strongly recommend the base solution. You might even find that a non-conditional base solution is plenty fast for this row-wise operation.
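As a rough illustration of what such a non-conditional base solution might look like, the sketch below recomputes var_total for every row in one vectorized step. It assumes the var columns are numeric and uses the question's example data after the edit to ID 2; pmax() and rowSums() are my choice here, not something the original answer spelled out.

```r
# Example data from the question, after var2 was set to -1 for ID 2
df <- data.frame(
  ID = 1:3,
  name = c("a", "b", "c"),
  var1 = c(1, 2, 1),
  var2 = c(1, -1, -1),
  var3 = c(2, 2, -1),
  var_total = c(4, 7, 1)
)

# Anchored pattern so that var_total itself is not summed
var_cols <- grep("^var[0-9]+$", names(df), value = TRUE)

# Replace non-positive values with 0, then sum each row;
# this is equivalent to summing only the values greater than zero
m <- as.matrix(df[var_cols])
df$var_total <- unname(rowSums(pmax(m, 0)))
df$var_total
```

Because everything happens on a numeric matrix, there is no per-row function call, which is usually where rowwise() and apply() spend their time.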