
Applying a custom function to every row in R

I created a function to calculate the rolling mean of a row in a data frame:

rollmean_circular <- function(x) {t(rollmean(t(cbind(x[9:10],x,x[1:2])),5))}

df <- structure(list(X1 = c(5L, 5L, 9L, 0L, 9L, 10L, 10L, 1L, 0L, 10L
), X2 = c(6L, 8L, 6L, 9L, 7L, 5L, 0L, 7L, 5L, 8L), X3 = c(10L, 
7L, 2L, 1L, 2L, 10L, 2L, 9L, 6L, 4L), X4 = c(6L, 0L, 9L, 1L, 
6L, 8L, 3L, 7L, 8L, 1L), X5 = c(0L, 9L, 8L, 3L, 1L, 8L, 3L, 9L, 
5L, 2L), X6 = c(0L, 10L, 9L, 10L, 3L, 1L, 6L, 0L, 6L, 9L), X7 = c(9L, 
10L, 0L, 10L, 10L, 9L, 0L, 1L, 10L, 2L), X8 = c(2L, 6L, 3L, 7L, 
7L, 9L, 8L, 9L, 1L, 0L), X9 = c(0L, 8L, 8L, 9L, 0L, 5L, 9L, 9L, 
4L, 8L), X10 = c(1L, 4L, 3L, 0L, 1L, 7L, 3L, 6L, 5L, 0L)), class = "data.frame", row.names = c(NA, 
-10L))

   X1 X2 X3 X4 X5 X6 X7 X8 X9 X10
1   5  6 10  6  0  0  9  2  0   1
2   5  8  7  0  9 10 10  6  8   4
3   9  6  2  9  8  9  0  3  8   3
4   0  9  1  1  3 10 10  7  9   0
5   9  7  2  6  1  3 10  7  0   1
6  10  5 10  8  8  1  9  9  5   7
7  10  0  2  3  3  6  0  8  9   3
8   1  7  9  7  9  0  1  9  9   6
9   0  5  6  8  5  6 10  1  4   5
10 10  8  4  1  2  9  2  0  8   0

Given a vector, this function appends the last 2 elements to the front and the first 2 elements to the back, then computes a rolling mean, so there are no NAs at the front or back.
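As a rough illustration of the circular padding described above, here is a minimal base R sketch using the first row of the data as a plain vector. The names `padded` and `circ_mean` are made up for this example, and the explicit loop over windows is only meant to show the idea, not to replace `rollmean`.

```r
# First row of the example data, as a plain numeric vector.
x <- c(5, 6, 10, 6, 0, 0, 9, 2, 0, 1)

# Circular padding: last 2 values in front, first 2 values at the back.
padded <- c(tail(x, 2), x, head(x, 2))

# Width-5 mean over each window of the padded vector; the result has
# the same length as x, with no missing values at either end.
circ_mean <- vapply(seq_along(x),
                    function(i) mean(padded[i:(i + 4)]),
                    numeric(1))
circ_mean
```

The values agree with the single-row output shown in the question (4.4, 5.6, 5.4, ...).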

It works perfectly when I apply it to one row of the data frame.

r = df[1,]
rollmean_circular(r)

  [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
1  4.4  5.6  5.4  4.4    5  3.4  2.2  2.4  3.4   2.8

However, when I use apply to run this function on every row of my data frame, it returns logical(0).

apply(df,1,rollmean_circular)

logical(0)

What am I missing?

When I apply another function that gives the same kind of output for a single row, it works:

stdize <- function(x, na.rm = TRUE) {(x - min(x, na.rm = na.rm)) / (max(x, na.rm = na.rm) - min(x, na.rm = na.rm))}

stdize(r)

   X1  X2 X3  X4 X5 X6  X7  X8 X9 X10
1 0.5 0.6  1 0.6  0  0 0.9 0.2  0 0.1

apply(df,1,stdize)

    [,1] [,2]      [,3] [,4] [,5]      [,6] [,7]      [,8] [,9] [,10]
X1   0.5  0.5 1.0000000  0.0  0.9 1.0000000  1.0 0.1111111  0.0   1.0
X2   0.6  0.8 0.6666667  0.9  0.7 0.4444444  0.0 0.7777778  0.5   0.8
X3   1.0  0.7 0.2222222  0.1  0.2 1.0000000  0.2 1.0000000  0.6   0.4
X4   0.6  0.0 1.0000000  0.1  0.6 0.7777778  0.3 0.7777778  0.8   0.1
X5   0.0  0.9 0.8888889  0.3  0.1 0.7777778  0.3 1.0000000  0.5   0.2
X6   0.0  1.0 1.0000000  1.0  0.3 0.0000000  0.6 0.0000000  0.6   0.9
X7   0.9  1.0 0.0000000  1.0  1.0 0.8888889  0.0 0.1111111  1.0   0.2
X8   0.2  0.6 0.3333333  0.7  0.7 0.8888889  0.8 1.0000000  0.1   0.0
X9   0.0  0.8 0.8888889  0.9  0.0 0.4444444  0.9 1.0000000  0.4   0.8
X10  0.1  0.4 0.3333333  0.0  0.1 0.6666667  0.3 0.6666667  0.5   0.0

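It seems you're confusing vectors and matrices in your function. apply passes each row to the function as a plain vector rather than a one-row data frame, so cbind builds a 10 x 3 matrix from the three pieces instead of one padded row. You could unlist in the function and transpose afterwards.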

rollmean_circular <- function(x) zoo::rollmean(unlist(c(x[9:10], x, x[1:2])),5)

t(apply(df, 1, rollmean_circular))
#       X1  X2  X3  X4  X5  X6  X7  X8  X9 X10
#  [1,] 4.4 5.6 5.4 4.4 5.0 3.4 2.2 2.4 3.4 2.8
#  [2,] 6.4 4.8 5.8 6.8 7.2 7.0 8.6 7.6 6.6 6.2
#  [3,] 5.6 5.8 6.8 6.8 5.6 5.8 5.6 4.6 4.6 5.8
#  [4,] 3.8 2.2 2.8 4.8 5.0 6.2 7.8 7.2 5.2 5.0
#  [5,] 3.8 5.0 5.0 3.8 4.4 5.4 4.2 4.2 5.4 4.8
#  [6,] 7.4 8.0 8.2 6.4 7.2 7.0 6.4 6.2 8.0 7.2
#  [7,] 4.8 3.6 3.6 2.8 2.8 4.0 5.2 5.2 6.0 6.0
#  [8,] 6.4 6.0 6.6 6.4 5.2 5.2 5.6 5.0 5.2 6.4
#  [9,] 4.0 4.8 4.8 6.0 7.0 6.0 5.2 5.2 4.0 3.0
# [10,] 6.0 4.6 5.0 4.8 3.6 2.8 4.2 3.8 4.0 5.2
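To see why the original version misbehaves inside apply, it may help to look at what cbind and c do with a plain vector. The sketch below assumes the row arrives as a named integer vector, which is how apply(df, 1, ...) passes it; the variable names `m` and `padded` are only for illustration.

```r
# One row of integers, as apply(df, 1, ...) would pass it: a named vector.
x <- c(X1 = 5L, X2 = 6L, X3 = 10L, X4 = 6L, X5 = 0L,
       X6 = 0L, X7 = 9L, X8 = 2L, X9 = 0L, X10 = 1L)

# cbind() on vectors builds one column per argument and recycles the
# two short pieces to the length of the longest one.
m <- cbind(x[9:10], x, x[1:2])
dim(m)      # 10 rows, 3 columns
dim(t(m))   # 3 rows, 10 columns: fewer rows than the window width of 5

# c() concatenates instead, giving the intended padded vector.
padded <- c(x[9:10], x, x[1:2])
length(padded)  # 14
```

With a one-row data frame, cbind places the pieces side by side, which is why the single-row call behaved as expected.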

This can also be done in base R (with most of the credit to @MattiPastell):

fun <- function(x, n=5) na.omit(stats::filter(c(tail(x, 2), x, head(x, 2)), rep(1 / n, n), sides=2))
t(apply(df, 1, fun))
#       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#  [1,]  4.4  5.6  5.4  4.4  5.0  3.4  2.2  2.4  3.4   2.8
#  [2,]  6.4  4.8  5.8  6.8  7.2  7.0  8.6  7.6  6.6   6.2
#  [3,]  5.6  5.8  6.8  6.8  5.6  5.8  5.6  4.6  4.6   5.8
#  [4,]  3.8  2.2  2.8  4.8  5.0  6.2  7.8  7.2  5.2   5.0
#  [5,]  3.8  5.0  5.0  3.8  4.4  5.4  4.2  4.2  5.4   4.8
#  [6,]  7.4  8.0  8.2  6.4  7.2  7.0  6.4  6.2  8.0   7.2
#  [7,]  4.8  3.6  3.6  2.8  2.8  4.0  5.2  5.2  6.0   6.0
#  [8,]  6.4  6.0  6.6  6.4  5.2  5.2  5.6  5.0  5.2   6.4
#  [9,]  4.0  4.8  4.8  6.0  7.0  6.0  5.2  5.2  4.0   3.0
# [10,]  6.0  4.6  5.0  4.8  3.6  2.8  4.2  3.8  4.0   5.2

rollmean automatically works on every column of its input, so this can be done directly, eliminating the apply:

library(zoo)
t(rollmean(t(cbind(df[9:10], df, df[1:2])), 5))

or using stats::filter from base R, which also works on every column:

t(filter(t(df), rep(1, 5)/5, circular = TRUE))

Either of these gives this matrix:

      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]  4.4  5.6  5.4  4.4  5.0  3.4  2.2  2.4  3.4   2.8
 [2,]  6.4  4.8  5.8  6.8  7.2  7.0  8.6  7.6  6.6   6.2
 [3,]  5.6  5.8  6.8  6.8  5.6  5.8  5.6  4.6  4.6   5.8
 [4,]  3.8  2.2  2.8  4.8  5.0  6.2  7.8  7.2  5.2   5.0
 [5,]  3.8  5.0  5.0  3.8  4.4  5.4  4.2  4.2  5.4   4.8
 [6,]  7.4  8.0  8.2  6.4  7.2  7.0  6.4  6.2  8.0   7.2
 [7,]  4.8  3.6  3.6  2.8  2.8  4.0  5.2  5.2  6.0   6.0
 [8,]  6.4  6.0  6.6  6.4  5.2  5.2  5.6  5.0  5.2   6.4
 [9,]  4.0  4.8  4.8  6.0  7.0  6.0  5.2  5.2  4.0   3.0
[10,]  6.0  4.6  5.0  4.8  3.6  2.8  4.2  3.8  4.0   5.2

Depending on the needs of your application, you could consider storing these series in columns rather than rows, in which case the transposes would not be needed.
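For instance, a small sketch of the column-wise layout, using the first two rows of the example data as two columns. The matrix `series` and its column names `a` and `b` are hypothetical, chosen only for this illustration.

```r
# Hypothetical column-wise layout: each column holds one series.
series <- cbind(a = c(5, 6, 10, 6, 0, 0, 9, 2, 0, 1),
                b = c(5, 8, 7, 0, 9, 10, 10, 6, 8, 4))

# stats::filter smooths each column separately; circular = TRUE wraps
# the window around the ends, so no padding or transposing is needed.
smoothed <- stats::filter(series, rep(1 / 5, 5), circular = TRUE)

# Smoothed values for the first series.
as.numeric(smoothed[, 1])
```

The result keeps one series per column, matching the first two rows of the matrix shown above once transposed.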
