How to calculate the sum of periods over each column for each row and divide it by a total minimum value in R

I have a calculation that accumulates the count of each flower across the year columns of the table below. I would like to extend the same calculation so that the accumulated values are also divided by each flower's total_min value.

(Data Frame)

# Raw (not yet accumulated) counts per year, as in Table 1
dat <- data.frame(
  stringsAsFactors = FALSE,
  flower = c("lily", "rose", "daisy"),
  x1902 = c(23L, 50L, 30L),
  x1950 = c(0L, 60L, 7L),
  x2010 = c(0L, 5L, 10L),
  x2012 = c(8L, 16L, 2L),
  x2021 = c(5L, 0L, 0L),
  total_min = c(5L, 5L, 2L)
)

The code that I used to do the accumulation is as follows:

dat[, 2:6] <- t(apply(dat[,2:6], 1, cumsum))

Now I need to divide the accumulated amount for each flower in each year by that flower's total_min. I tried the following code, but it does not give the correct results:

dat[,2:6] <- t(apply(dat[,2:6], 1, cumsum)/dat$total_min)

(Table 1 - DataFrame)

flower  x1902  x1950  x2010  x2012  x2021  total_min
lily       23      0      0      8      5          5
rose       50     60      5     16      0          5
daisy      30      7     10      2      0          2

Calculating the cumulative sum for each flower across the years gives me:

(Table 2 - Accumulated results)

flower  x1902  x1950  x2010  x2012  x2021
lily       23     23     23     31     36
rose       50    110    115    131    131
daisy      30     37     47     49     49

The final result should look like Table 3:

(Table 3 - Expected results)

flower  x1902  x1950  x2010  x2012  x2021
lily      4.6    4.6    4.6    6.2    7.2
rose     10.0   22.0   23.0   26.2   26.2
daisy    15.0   18.5   23.5   24.5   24.5

The problems with the code in the question are:

  • the final ) is in the wrong place: the division by dat$total_min has to happen after the transpose t(...), not inside it, so that each row (flower) lines up with its own total_min.
  • although not wrong, it would be less error prone to store the result under a new name rather than overwriting the original data.
  • it would also be a bit more robust to select the columns whose names start with x rather than hard coding 2:6.
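
To see why the position of the parenthesis matters, here is a small sketch that compares the two orderings on the raw data from the Note. The names acc, wrong and right are only illustrative and do not appear in the original code.

```r
# Raw counts from the Note (dat0), rebuilt here so the block runs on its own.
dat0 <- data.frame(
  flower = c("lily", "rose", "daisy"),
  x1902 = c(23L, 50L, 30L),
  x1950 = c(0L, 60L, 7L),
  x2010 = c(0L, 5L, 10L),
  x2012 = c(8L, 16L, 2L),
  x2021 = c(5L, 0L, 0L),
  total_min = c(5L, 5L, 2L)
)

# apply() over rows returns the results as columns:
# a 5 x 3 matrix with years in rows and flowers in columns.
acc <- apply(dat0[, 2:6], 1, cumsum)
dim(acc)

# Ordering used in the question: divide first, then transpose.
# total_min (length 3) is recycled down columns of length 5,
# so the divisors no longer line up with the flowers.
wrong <- t(acc / dat0$total_min)

# Corrected ordering: transpose first, then divide.
# Rows are now flowers, so total_min[i] divides row i.
right <- t(acc) / dat0$total_min

wrong[1, ]  # lily row, some entries divided by 2 instead of 5
right[1, ]  # lily row, every entry divided by 5
```

Because the recycled vector length (3) divides the number of matrix elements (15) evenly, R gives no warning for the misaligned version, which makes this mistake easy to miss.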

1) Using dat0 shown in the Note at the end, we have:

xnms <- startsWith(names(dat0), "x")
dat2 <- replace(dat0, xnms, t(apply(dat0[xnms], 1, cumsum)) / dat0$total_min)
dat2
##   flower x1902 x1950 x2010 x2012 x2021 total_min
## 1   lily   4.6   4.6   4.6   6.2   7.2         5
## 2   rose  10.0  22.0  23.0  26.2  26.2         5
## 3  daisy  15.0  18.5  23.5  24.5  24.5         2
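
As a variation on the same idea (not part of the approach above), the row-wise division can be written out explicitly with sweep() instead of relying on vector recycling. This is a minimal sketch; dat3 and acc are names chosen here for illustration.

```r
# Raw counts from the Note (dat0), rebuilt here so the block runs on its own.
dat0 <- data.frame(
  flower = c("lily", "rose", "daisy"),
  x1902 = c(23L, 50L, 30L),
  x1950 = c(0L, 60L, 7L),
  x2010 = c(0L, 5L, 10L),
  x2012 = c(8L, 16L, 2L),
  x2021 = c(5L, 0L, 0L),
  total_min = c(5L, 5L, 2L)
)

# Logical index of the year columns (names starting with "x").
xnms <- startsWith(names(dat0), "x")

# Row-wise cumulative sums, transposed back to flowers x years (3 x 5).
acc <- t(apply(dat0[xnms], 1, cumsum))

# sweep() with MARGIN = 1 divides row i by total_min[i],
# which states the row alignment explicitly.
dat3 <- dat0
dat3[xnms] <- sweep(acc, 1, dat0$total_min, "/")
dat3
```

The result should match dat2 above; the only difference is that the margin being divided is named in the call rather than implied by the shape of the matrix.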

2) Another approach is Reduce:

replace(dat0, xnms, 
  as.data.frame(Reduce("+", dat0[xnms], accumulate = TRUE)) / dat0$total_min)
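
To make the Reduce approach easier to follow, the sketch below splits it into steps and inspects the intermediate list. The names steps and res are illustrative, and naming the list elements before converting is an addition made here only to get readable column names.

```r
# Raw counts from the Note (dat0), rebuilt here so the block runs on its own.
dat0 <- data.frame(
  flower = c("lily", "rose", "daisy"),
  x1902 = c(23L, 50L, 30L),
  x1950 = c(0L, 60L, 7L),
  x2010 = c(0L, 5L, 10L),
  x2012 = c(8L, 16L, 2L),
  x2021 = c(5L, 0L, 0L),
  total_min = c(5L, 5L, 2L)
)

xnms <- startsWith(names(dat0), "x")

# A data frame is a list of columns, so Reduce() adds the year columns
# one after another; accumulate = TRUE keeps every running total.
steps <- Reduce(`+`, dat0[xnms], accumulate = TRUE)
length(steps)  # one vector per year column
steps[[2]]     # x1902 + x1950 for lily, rose, daisy

# Name the running totals after the year columns, then divide each
# column elementwise by total_min (one divisor per flower).
names(steps) <- names(dat0)[xnms]
res <- as.data.frame(steps) / dat0$total_min
res
```

Since the running totals are built column by column, no transpose is needed here: each list element already has one value per flower, in the same order as total_min.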

Note

dat0 <-
structure(list(flower = c("lily", "rose", "daisy"), x1902 = c(23L, 
50L, 30L), x1950 = c(0L, 60L, 7L), x2010 = c(0L, 5L, 10L), x2012 = c(8L, 
16L, 2L), x2021 = c(5L, 0L, 0L), total_min = c(5L, 5L, 2L)),
class = "data.frame", row.names = c(NA, -3L))

dat0
##   flower x1902 x1950 x2010 x2012 x2021 total_min
## 1   lily    23     0     0     8     5         5
## 2   rose    50    60     5    16     0         5
## 3  daisy    30     7    10     2     0         2
