How to normalize time series data in R?
I have the matrix below. How do I divide each row by its mean?
TAXA 1992 1993 1994 1995
Aba 1 0 0.01 0
Abr 2 0.084 0.1 3
Amp 7 6 4 2
I think you want one of these:
For a data frame:
cbind(df[1], df[-1] / rowMeans(df[-1]))
# TAXA X1992 X1993 X1994 X1995
# 1 Aba 3.960396 0.00000000 0.03960396 0.0000000
# 2 Abr 1.543210 0.06481481 0.07716049 2.3148148
# 3 Amp 1.473684 1.26315789 0.84210526 0.4210526
For a matrix:
m / rowMeans(m)
# 1992 1993 1994 1995
# Aba 3.960396 0.00000000 0.03960396 0.0000000
# Abr 1.543210 0.06481481 0.07716049 2.3148148
# Amp 1.473684 1.26315789 0.84210526 0.4210526
This finds the mean of each row, then divides each row by its corresponding mean. The first version assumes the first column in your example is an actual column of a data frame, while the second assumes it holds the row names of a matrix.
Data:
df <- structure(list(TAXA = structure(1:3, .Label = c("Aba", "Abr",
"Amp"), class = "factor"), X1992 = c(1L, 2L, 7L), X1993 = c(0,
0.084, 6), X1994 = c(0.01, 0.1, 4), X1995 = c(0L, 3L, 2L)), .Names = c("TAXA",
"X1992", "X1993", "X1994", "X1995"), class = "data.frame", row.names = c(NA,
-3L))
m <- structure(c(1, 2, 7, 0, 0.084, 6, 0.01, 0.1, 4, 0, 3, 2), .Dim = 3:4, .Dimnames = list(
c("Aba", "Abr", "Amp"), c("1992", "1993", "1994", "1995"
)))
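As a sketch of why the matrix version divides by rows rather than by columns: R stores a matrix column by column, so the length-3 vector of row means is recycled down each column in turn. The `sweep()` call below is an alternative that is not part of the answer above; it is shown only because it names the margin explicitly. The matrix is rebuilt inline so the snippet runs on its own.

```r
# Same values as the matrix m in the answer, rebuilt inline.
m <- matrix(
  c(1, 0,     0.01, 0,
    2, 0.084, 0.1,  3,
    7, 6,     4,    2),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("Aba", "Abr", "Amp"), c("1992", "1993", "1994", "1995"))
)

# Relies on column-major storage: rowMeans(m) has one entry per row and is
# recycled down every column, so element [i, j] is divided by the mean of row i.
by_recycling <- m / rowMeans(m)

# sweep() spells out the same operation: divide along margin 1 (rows).
by_sweep <- sweep(m, 1, rowMeans(m), "/")

# Compare the two results, then check that every normalized row has mean 1.
all.equal(by_recycling, by_sweep)
rowMeans(by_sweep)
```

Note that a row whose mean is 0 would produce `NaN` or `Inf` with either form, so such rows may need to be handled before dividing.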
Using the 'tidy data' approach (I copied the data from the question to the clipboard):
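# Named dat rather than t, so the base function t() is not masked.
# On macOS, read from pipe("pbpaste") instead of "clipboard".
dat <- read.table("clipboard", sep = " ", header = TRUE)
library(tidyr)
library(dplyr)
dat %>%
  gather(year, value, -TAXA) %>%           # wide -> long: one row per TAXA and year
  group_by(TAXA) %>%
  mutate(value = value / mean(value)) %>%  # divide by the per-TAXA mean
  spread(year, value)                      # long -> wide again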
You get:
Source: local data frame [3 x 5]
TAXA X1992 X1993 X1994 X1995
1 Aba 3.960396 0.00000000 0.03960396 0.0000000
2 Abr 1.543210 0.06481481 0.07716049 2.3148148
3 Amp 1.473684 1.26315789 0.84210526 0.4210526
It gathers the values from many columns into one. (They're all getting the same treatment, so they should be in one column.) Then it calculates the mean for each TAXA separately, and reshapes the data back into the wide format.
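In current versions of tidyr, `gather()` and `spread()` still work but are superseded by `pivot_longer()` and `pivot_wider()`. The following is a minimal sketch of the same reshape, normalize, reshape steps with the newer functions, assuming tidyr 1.0.0 or later. The data frame `dat` and the result name `normalized` are illustrative, and the data is typed in directly instead of read from the clipboard.

```r
library(dplyr)
library(tidyr)

# Data from the question, entered by hand (illustrative name: dat).
dat <- data.frame(
  TAXA  = c("Aba", "Abr", "Amp"),
  X1992 = c(1, 2, 7),
  X1993 = c(0, 0.084, 6),
  X1994 = c(0.01, 0.1, 4),
  X1995 = c(0, 3, 2)
)

normalized <- dat %>%
  # wide -> long: one row per TAXA and year
  pivot_longer(-TAXA, names_to = "year", values_to = "value") %>%
  group_by(TAXA) %>%
  # divide each value by the mean of its own TAXA
  mutate(value = value / mean(value)) %>%
  ungroup() %>%
  # long -> wide: one column per year again
  pivot_wider(names_from = year, values_from = value)

normalized
```

The explicit `ungroup()` is a design choice: it returns a plain tibble, so later steps do not silently operate per TAXA.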