I have an issue that I can't figure out how to solve. I am looking for a way to split rows so that no row exceeds a specific maximum, which may differ per type.
The real data is more complicated than the test data; originally the max column is imported and joined in using dplyr's left_join() function.
test <- data.frame(datetime = c("24/9/2020", "24/9/2020", "25/9/2020"),
                   type = c(1, 2, 3),
                   units = c(5, 8, 12),
                   max = c(6, 6, 4))
preferred <- data.frame(datetime = c("24/9/2020", "24/9/2020", "24/9/2020", "25/9/2020", "25/9/2020", "25/9/2020"),
                        type = c(1, 2, 2, 3, 3, 3),
                        units = c(5, 6, 2, 4, 4, 4),
                        max = c(6, 6, 6, 4, 4, 4))
I have tried different methods but was not able to solve it without using both while and for loops. I am sure it can be done more quickly and easily with specific functions, but I haven't been able to figure out how.
Is there a way to get from test to the preferred output:
> test
   datetime type units max
1 24/9/2020    1     5   6
2 24/9/2020    2     8   6
3 25/9/2020    3    12   4
> preferred
   datetime type units max
1 24/9/2020    1     5   6
2 24/9/2020    2     6   6
3 24/9/2020    2     2   6
4 25/9/2020    3     4   4
5 25/9/2020    3     4   4
6 25/9/2020    3     4   4
If you have any input, please let me know.
Thank you in advance!!! Much appreciated
You can use the old split-apply-bind approach, like this:
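do.call(rbind, lapply(split(test, test$type), function(x) {
  # Assumes one row per type, so x is a single-row data frame
  if (x$units > x$max)
  {
    # Prepend one copy of the row for each full chunk of size max
    x <- rbind(x[rep(1, x$units %/% x$max), ], x)
    # Every row but the last holds a full chunk
    x$units[-nrow(x)] <- x$max[1]
    # The last row holds the remainder (0 if units divides evenly)
    x$units[nrow(x)] <- x$units[nrow(x)] %% x$max[1]
  }
  # Drop the empty remainder row, if any
  x[x$units != 0, ]
}))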
#>        datetime type units max
#> 1     24/9/2020    1     5   6
#> 2.2   24/9/2020    2     6   6
#> 2.21  24/9/2020    2     2   6
#> 3.3   25/9/2020    3     4   4
#> 3.3.1 25/9/2020    3     4   4
#> 3.3.2 25/9/2020    3     4   4
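If you would rather avoid the split and rbind steps, the same idea can be written in a vectorized way. This is only a sketch, assuming max is always positive; the names n_parts, piece, and out are made up for illustration. It repeats each row ceiling(units / max) times and then fills in each piece as either max or whatever is left over.

```r
test <- data.frame(datetime = c("24/9/2020", "24/9/2020", "25/9/2020"),
                   type = c(1, 2, 3),
                   units = c(5, 8, 12),
                   max = c(6, 6, 4))

# Number of output rows needed for each input row (assumes max > 0)
n_parts <- ceiling(test$units / test$max)

# Repeat each input row n_parts times
out <- test[rep(seq_len(nrow(test)), n_parts), ]

# Position of each piece within its original row: 1, 2, ..., n_parts
piece <- sequence(n_parts)

# Each piece is max, except the last one, which gets what is left
out$units <- pmin(out$max, out$units - (piece - 1) * out$max)

# Reset the row names to 1..n
rownames(out) <- NULL
out
```

Rows with units equal to 0 end up with n_parts of 0 and are dropped, which matches the filtering on units != 0 in the version above.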
Here's an approach that builds a utility function for calculating a single result, and then uses dplyr, purrr, and tidyr to apply it by group in your data frame:
# Split `units` into chunks of size `max`, plus a remainder if there is one
additive_components = function(max, units) {
  result = rep(max, units %/% max)
  if(units %% max != 0) result = c(result, units %% max)
  return(result)
}
library(dplyr)
library(purrr)
library(tidyr)  # for unnest()
test %>% group_by(type) %>%
  mutate(new_units = map2(max, units, additive_components)) %>%
  unnest(new_units)
# # A tibble: 6 x 5
# # Groups:   type [3]
#   datetime   type units   max new_units
#   <chr>     <dbl> <dbl> <dbl>     <dbl>
# 1 24/9/2020     1     5     6         5
# 2 24/9/2020     2     8     6         6
# 3 24/9/2020     2     8     6         2
# 4 25/9/2020     3    12     4         4
# 5 25/9/2020     3    12     4         4
# 6 25/9/2020     3    12     4         4
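Note that this result keeps the original units column and puts the split values in new_units. If you want the same columns as preferred, one option is to overwrite units directly. This is a sketch assuming dplyr, purrr, and tidyr are installed; the name result is just for illustration. Since map2() already works row by row, the group_by() step is left out here.

```r
library(dplyr)
library(purrr)
library(tidyr)

# Same helper as above: chunks of size `max`, plus a remainder if any
additive_components <- function(max, units) {
  result <- rep(max, units %/% max)
  if (units %% max != 0) result <- c(result, units %% max)
  result
}

test <- data.frame(datetime = c("24/9/2020", "24/9/2020", "25/9/2020"),
                   type = c(1, 2, 3),
                   units = c(5, 8, 12),
                   max = c(6, 6, 4))

# Replace `units` with a list column of pieces, then expand to one row per piece
result <- test %>%
  mutate(units = map2(max, units, additive_components)) %>%
  unnest(units)

result
```

The result is a tibble rather than a plain data frame; wrap it in as.data.frame() if that matters for your later steps.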