Transform columns based on row sums

I would like to transform my data frame, first by summing each row. For example, for the row that starts with 10, I would like to sum that row's values. As a second step, I'd like to use the row sum to determine how many times the first value appears in the output. Any suggestions?

Sample data:

structure(list(X1 = c(10, 20, 30, 40), `04:00` = c(1, 0, 0, 1
), `04:10` = c(1, 0, 0, 1), `04:20` = c(1, 0, 0, 1), `04:30` = c(1, 
0, 0, 1), `04:40` = c(2, 0, 0, 1), `04:50` = c(1, 0, 0, 1), `05:00` = c(3, 
0, 0, 1), `05:10` = c(1, 1, 1, 1), `05:20` = c(5, 0, 0, 1), Sum = c(16, 
1, 1, 9)), class = c("spec_tbl_df", "tbl_df", "tbl", "data.frame"
), row.names = c(NA, -4L), spec = structure(list(cols = list(
    X1 = structure(list(), class = c("collector_double", "collector"
    )), `04:00` = structure(list(), class = c("collector_double", 
    "collector")), `04:10` = structure(list(), class = c("collector_double", 
    "collector")), `04:20` = structure(list(), class = c("collector_double", 
    "collector")), `04:30` = structure(list(), class = c("collector_double", 
    "collector")), `04:40` = structure(list(), class = c("collector_double", 
    "collector")), `04:50` = structure(list(), class = c("collector_double", 
    "collector")), `05:00` = structure(list(), class = c("collector_double", 
    "collector")), `05:10` = structure(list(), class = c("collector_double", 
    "collector")), `05:20` = structure(list(), class = c("collector_double", 
    "collector")), Sum = structure(list(), class = c("collector_double", 
    "collector"))), default = structure(list(), class = c("collector_guess", 
"collector")), skip = 1L), class = "col_spec"))

Original data

[Image: original data, not reproduced here]

Desired output

[Image: desired output, not reproduced here]

Here's a long-winded tidyverse approach:

library(dplyr)
library(tidyr)

df %>%
  select(X1, Sum) %>%
  uncount(max(Sum)) %>%
  group_by(X1) %>% 
  mutate(col = ifelse(row_number() <= first(Sum), X1, NA), 
         row = row_number()) %>% 
  ungroup %>%
  select(-Sum) %>%
  pivot_wider(names_from = X1, values_from = col) %>%
  select(-row)

#    `10`  `20`  `30`  `40`
#   <dbl> <dbl> <dbl> <dbl>
# 1    10    20    30    40
# 2    10    NA    NA    40
# 3    10    NA    NA    40
# 4    10    NA    NA    40
# 5    10    NA    NA    40
# 6    10    NA    NA    40
# 7    10    NA    NA    40
# 8    10    NA    NA    40
# 9    10    NA    NA    40
#10    10    NA    NA    NA
#11    10    NA    NA    NA
#12    10    NA    NA    NA
#13    10    NA    NA    NA
#14    10    NA    NA    NA
#15    10    NA    NA    NA
#16    10    NA    NA    NA
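
For readers without the tidyverse installed, the same "expand to long form, then spread to wide" idea can be sketched in base R. This is not the pipeline above, only an analogue of it: `rep()` stands in for `uncount()`, `sequence()` for the grouped `row_number()`, and matrix indexing for `pivot_wider()`. The data frame `toy` and the names `long` and `wide` are hypothetical, and the sketch assumes the values in `X1` are unique.

```r
# Hypothetical stand-in for the question's data:
# X1 is the value to repeat, Sum is how many times to repeat it.
toy <- data.frame(X1 = c(10, 20, 30), Sum = c(3, 1, 2))

# Long form: one row per repetition, plus a within-group row index.
long <- data.frame(
  X1  = rep(toy$X1, toy$Sum),
  row = sequence(toy$Sum)
)

# Wide form: start from an all-NA matrix with one column per X1 value,
# then fill only the (row, column) cells that exist in the long form.
wide <- matrix(
  NA_real_,
  nrow = max(toy$Sum),
  ncol = nrow(toy),
  dimnames = list(NULL, as.character(toy$X1))
)
wide[cbind(long$row, match(long$X1, toy$X1))] <- long$X1

wide
```

Cells that are never filled stay `NA`, which is what produces the ragged columns in the desired output.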

A base R alternative with mapply (the `\(x, n)` lambda shorthand needs R 4.1 or later):

mapply(\(x, n) {out <- rep(x, n); length(out) <- max(df$Sum); out}, df$X1, df$Sum)
#       [,1] [,2] [,3] [,4]
#  [1,]   10   20   30   40
#  [2,]   10   NA   NA   40
#  [3,]   10   NA   NA   40
#  [4,]   10   NA   NA   40
#  [5,]   10   NA   NA   40
#  [6,]   10   NA   NA   40
#  [7,]   10   NA   NA   40
#  [8,]   10   NA   NA   40
#  [9,]   10   NA   NA   40
# [10,]   10   NA   NA   NA
# [11,]   10   NA   NA   NA
# [12,]   10   NA   NA   NA
# [13,]   10   NA   NA   NA
# [14,]   10   NA   NA   NA
# [15,]   10   NA   NA   NA
# [16,]   10   NA   NA   NA
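
The one-liner above relies on the fact that assigning a larger `length()` to a vector pads it with `NA`. A minimal sketch of that step in isolation, assuming a small hypothetical data frame `toy` and a hypothetical helper `pad_rep()` (neither is part of the original answer):

```r
# Hypothetical stand-in for the question's data.
toy <- data.frame(X1 = c(10, 20, 30), Sum = c(3, 1, 2))

# Repeat x n times, then stretch the result to `len` elements.
# Growing a vector with `length<-` fills the new slots with NA.
pad_rep <- function(x, n, len) {
  out <- rep(x, n)
  length(out) <- len
  out
}

# Every column is padded to the same length, so mapply() can
# simplify the results into a matrix with one column per X1 value.
res <- mapply(pad_rep, toy$X1, toy$Sum,
              MoreArgs = list(len = max(toy$Sum)))
colnames(res) <- toy$X1

res
```

Naming the helper and passing the target length through `MoreArgs` avoids reaching into the global `df` from inside the function, at the cost of a few more lines.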

One more approach, using purrr:

df <- structure(list(X1 = c(10, 20, 30, 40), `04:00` = c(1, 0, 0, 1
), `04:10` = c(1, 0, 0, 1), `04:20` = c(1, 0, 0, 1), `04:30` = c(1, 
0, 0, 1), `04:40` = c(2, 0, 0, 1), `04:50` = c(1, 0, 0, 1), `05:00` = c(3, 
0, 0, 1), `05:10` = c(1, 1, 1, 1), `05:20` = c(5, 0, 0, 1), Sum = c(16, 
1, 1, 9)), class = c("spec_tbl_df", "tbl_df", "tbl", "data.frame"
), row.names = c(NA, -4L), spec = structure(list(cols = list(
    X1 = structure(list(), class = c("collector_double", "collector"
    )), `04:00` = structure(list(), class = c("collector_double", 
    "collector")), `04:10` = structure(list(), class = c("collector_double", 
    "collector")), `04:20` = structure(list(), class = c("collector_double", 
    "collector")), `04:30` = structure(list(), class = c("collector_double", 
    "collector")), `04:40` = structure(list(), class = c("collector_double", 
    "collector")), `04:50` = structure(list(), class = c("collector_double", 
    "collector")), `05:00` = structure(list(), class = c("collector_double", 
    "collector")), `05:10` = structure(list(), class = c("collector_double", 
    "collector")), `05:20` = structure(list(), class = c("collector_double", 
    "collector")), Sum = structure(list(), class = c("collector_double", 
    "collector"))), default = structure(list(), class = c("collector_guess", 
"collector")), skip = 1L), class = "col_spec"))


library(tidyverse)
#deleting your dummy column
df$Sum <- NULL

map2_dfc(rowSums(select(df, !'X1')), df$X1, ~  
       c(rep(.y, .x), rep(NA, max(rowSums(select(df, !'X1'))) - .x)) %>% 
         as.data.frame() %>% 
       set_names(.y))

#>    10 20 30 40
#> 1  10 20 30 40
#> 2  10 NA NA 40
#> 3  10 NA NA 40
#> 4  10 NA NA 40
#> 5  10 NA NA 40
#> 6  10 NA NA 40
#> 7  10 NA NA 40
#> 8  10 NA NA 40
#> 9  10 NA NA 40
#> 10 10 NA NA NA
#> 11 10 NA NA NA
#> 12 10 NA NA NA
#> 13 10 NA NA NA
#> 14 10 NA NA NA
#> 15 10 NA NA NA
#> 16 10 NA NA NA
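
The answer above recomputes the counts from the time columns with `rowSums()` instead of trusting a stored `Sum` column. A minimal base R sketch of that same idea, assuming a small hypothetical data frame `toy` with two made-up time columns (the names `counts`, `cols`, and `out` are illustrative only):

```r
# Hypothetical stand-in for the question's data, without a Sum column.
# check.names = FALSE keeps the non-syntactic column names as written.
toy <- data.frame(
  X1      = c(10, 20),
  `04:00` = c(1, 0),
  `04:10` = c(2, 1),
  check.names = FALSE
)

# Step 1: sum every column except X1, row by row.
counts <- rowSums(toy[setdiff(names(toy), "X1")])

# Step 2: repeat each X1 value `counts` times and pad with NA
# so that all columns have the same length.
n_max <- max(counts)
cols <- lapply(seq_along(counts), function(i) {
  c(rep(toy$X1[i], counts[i]), rep(NA, n_max - counts[i]))
})

# Step 3: bind the padded vectors into a data frame named after X1.
out <- data.frame(setNames(cols, toy$X1), check.names = FALSE)

out
```

Computing the counts on the fly keeps the result correct even if a precomputed total in the data has gone stale.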

Created on 2021-06-19 by the reprex package (v2.0.0)
