Transform columns based on row sums
I would like to transform my data frame, first by summing each row. For example, for the row that starts with 10, I would like to sum that row's values.
As a second step, I'd like to use the row sum to determine how many times the row's first value is repeated. Any suggestions?
Sample data:
structure(list(X1 = c(10, 20, 30, 40), `04:00` = c(1, 0, 0, 1
), `04:10` = c(1, 0, 0, 1), `04:20` = c(1, 0, 0, 1), `04:30` = c(1,
0, 0, 1), `04:40` = c(2, 0, 0, 1), `04:50` = c(1, 0, 0, 1), `05:00` = c(3,
0, 0, 1), `05:10` = c(1, 1, 1, 1), `05:20` = c(5, 0, 0, 1), Sum = c(16,
1, 1, 9)), class = c("spec_tbl_df", "tbl_df", "tbl", "data.frame"
), row.names = c(NA, -4L), spec = structure(list(cols = list(
X1 = structure(list(), class = c("collector_double", "collector"
)), `04:00` = structure(list(), class = c("collector_double",
"collector")), `04:10` = structure(list(), class = c("collector_double",
"collector")), `04:20` = structure(list(), class = c("collector_double",
"collector")), `04:30` = structure(list(), class = c("collector_double",
"collector")), `04:40` = structure(list(), class = c("collector_double",
"collector")), `04:50` = structure(list(), class = c("collector_double",
"collector")), `05:00` = structure(list(), class = c("collector_double",
"collector")), `05:10` = structure(list(), class = c("collector_double",
"collector")), `05:20` = structure(list(), class = c("collector_double",
"collector")), Sum = structure(list(), class = c("collector_double",
"collector"))), default = structure(list(), class = c("collector_guess",
"collector")), skip = 1L), class = "col_spec"))
Original data
Desired output
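As a rough illustration of the two steps described in the question (row sums, then repetition), here is a minimal base R sketch on a reduced, hypothetical data frame. The name `toy` and its columns `t1` to `t3` are assumptions made for this example and are not part of the original data.

```r
# Reduced, hypothetical stand-in for the sample data:
# an id column (X1) plus three count columns
toy <- data.frame(X1 = c(10, 20), t1 = c(1, 0), t2 = c(2, 1), t3 = c(0, 0))

# Step 1: sum each row over every column except X1
row_totals <- rowSums(toy[setdiff(names(toy), "X1")])

# Step 2: repeat each X1 value as many times as its row total
repeated <- Map(rep, toy$X1, row_totals)
repeated
```

The result is a list with one vector per row. Because the vectors have different lengths, they have to be padded with NA before they can sit side by side as columns.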
Here's a long-winded tidyverse approach:
library(dplyr)
library(tidyr)
df %>%
  select(X1, Sum) %>%
  uncount(max(Sum)) %>%
  group_by(X1) %>%
  mutate(col = ifelse(row_number() <= first(Sum), X1, NA),
         row = row_number()) %>%
  ungroup() %>%
  select(-Sum) %>%
  pivot_wider(names_from = X1, values_from = col) %>%
  select(-row)
# `10` `20` `30` `40`
# <dbl> <dbl> <dbl> <dbl>
# 1 10 20 30 40
# 2 10 NA NA 40
# 3 10 NA NA 40
# 4 10 NA NA 40
# 5 10 NA NA 40
# 6 10 NA NA 40
# 7 10 NA NA 40
# 8 10 NA NA 40
# 9 10 NA NA 40
#10 10 NA NA NA
#11 10 NA NA NA
#12 10 NA NA NA
#13 10 NA NA NA
#14 10 NA NA NA
#15 10 NA NA NA
#16 10 NA NA NA
mapply(\(x, n) {out <- rep(x, n); length(out) <- max(df$Sum); out}, df$X1, df$Sum)
# [,1] [,2] [,3] [,4]
# [1,] 10 20 30 40
# [2,] 10 NA NA 40
# [3,] 10 NA NA 40
# [4,] 10 NA NA 40
# [5,] 10 NA NA 40
# [6,] 10 NA NA 40
# [7,] 10 NA NA 40
# [8,] 10 NA NA 40
# [9,] 10 NA NA 40
# [10,] 10 NA NA NA
# [11,] 10 NA NA NA
# [12,] 10 NA NA NA
# [13,] 10 NA NA NA
# [14,] 10 NA NA NA
# [15,] 10 NA NA NA
# [16,] 10 NA NA NA
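The `mapply()` call above relies on the fact that assigning a larger value to `length()` pads a vector with NA. A minimal sketch of that step in isolation, using made-up values:

```r
# Repeat a value three times, then extend the vector to length 5
out <- rep(10, 3)
length(out) <- 5  # the two new positions are filled with NA
out
```

Note that the `\(x, n)` shorthand for anonymous functions requires R 4.1.0 or later; on older versions, `function(x, n)` is the equivalent spelling.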
One more approach, using purrr:
df <- structure(list(X1 = c(10, 20, 30, 40), `04:00` = c(1, 0, 0, 1
), `04:10` = c(1, 0, 0, 1), `04:20` = c(1, 0, 0, 1), `04:30` = c(1,
0, 0, 1), `04:40` = c(2, 0, 0, 1), `04:50` = c(1, 0, 0, 1), `05:00` = c(3,
0, 0, 1), `05:10` = c(1, 1, 1, 1), `05:20` = c(5, 0, 0, 1), Sum = c(16,
1, 1, 9)), class = c("spec_tbl_df", "tbl_df", "tbl", "data.frame"
), row.names = c(NA, -4L), spec = structure(list(cols = list(
X1 = structure(list(), class = c("collector_double", "collector"
)), `04:00` = structure(list(), class = c("collector_double",
"collector")), `04:10` = structure(list(), class = c("collector_double",
"collector")), `04:20` = structure(list(), class = c("collector_double",
"collector")), `04:30` = structure(list(), class = c("collector_double",
"collector")), `04:40` = structure(list(), class = c("collector_double",
"collector")), `04:50` = structure(list(), class = c("collector_double",
"collector")), `05:00` = structure(list(), class = c("collector_double",
"collector")), `05:10` = structure(list(), class = c("collector_double",
"collector")), `05:20` = structure(list(), class = c("collector_double",
"collector")), Sum = structure(list(), class = c("collector_double",
"collector"))), default = structure(list(), class = c("collector_guess",
"collector")), skip = 1L), class = "col_spec"))
library(tidyverse)
# deleting your dummy column
df$Sum <- NULL
map2_dfc(rowSums(select(df, !'X1')), df$X1, ~
           c(rep(.y, .x), rep(NA, max(rowSums(select(df, !'X1'))) - .x)) %>%
           as.data.frame() %>%
           set_names(.y))
#> 10 20 30 40
#> 1 10 20 30 40
#> 2 10 NA NA 40
#> 3 10 NA NA 40
#> 4 10 NA NA 40
#> 5 10 NA NA 40
#> 6 10 NA NA 40
#> 7 10 NA NA 40
#> 8 10 NA NA 40
#> 9 10 NA NA 40
#> 10 10 NA NA NA
#> 11 10 NA NA NA
#> 12 10 NA NA NA
#> 13 10 NA NA NA
#> 14 10 NA NA NA
#> 15 10 NA NA NA
#> 16 10 NA NA NA
Created on 2021-06-19 by the reprex package (v2.0.0)
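The purrr answer drops the `Sum` column and recomputes the totals from the time columns. As a sanity check that both routes agree on the sample data, here is a small base R sketch; it rebuilds the data as a plain `data.frame` (an assumption for self-containment, the original is a tibble), and `time_cols` and `recomputed` are names introduced only for this example.

```r
# Sample data rebuilt as a plain data.frame (values copied from the question)
df <- data.frame(
  X1 = c(10, 20, 30, 40),
  `04:00` = c(1, 0, 0, 1), `04:10` = c(1, 0, 0, 1), `04:20` = c(1, 0, 0, 1),
  `04:30` = c(1, 0, 0, 1), `04:40` = c(2, 0, 0, 1), `04:50` = c(1, 0, 0, 1),
  `05:00` = c(3, 0, 0, 1), `05:10` = c(1, 1, 1, 1), `05:20` = c(5, 0, 0, 1),
  Sum = c(16, 1, 1, 9),
  check.names = FALSE  # keep the "04:00"-style column names unchanged
)

# Row sums over the time columns only (everything except X1 and Sum)
time_cols <- setdiff(names(df), c("X1", "Sum"))
recomputed <- unname(rowSums(df[time_cols]))

# TRUE when the stored Sum column matches the recomputed totals
all(recomputed == df$Sum)
```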