R apply custom vectorised function to row in dataframe, specific columns
This should be simple, but I just can't get apply to communicate with my vectorised function.
Test data is:
df <- data.frame(a = 1:3, b1 = c(4:5, NA), b2 = c(5, 6, 5))
It looks like this:
  a b1 b2
1 1  4  5
2 2  5  6
3 3 NA  5
The custom function returns a logical vector indicating whether each value is a non-missing whole number that falls in a given interval.
validScore <- function(x, a, b) {
is.na(x) == FALSE &
x%%1 == 0 &
findInterval(x, c(a,b), rightmost.closed = TRUE) == 1
}
Test of the custom function:
validScore(c(3, 3.5, 6, NA), 1, 5)
returns the logical vector TRUE FALSE FALSE FALSE, as expected.
I want to run the custom function on each row, using only the columns b1 and b2. This would return TRUE FALSE FALSE (that is, TRUE for (b1=4, b2=5), FALSE for (b1=5, b2=6) and FALSE for (b1=NA, b2=5)).
The answers to "Call apply-like function on each row of dataframe with multiple arguments from each row" (for selecting the columns) and "how to apply a function to every row of a matrix (or a data frame) in R" together suggest the following:
library(dplyr)
apply(select(df, b1:b2), 1, function(x) validScore(x, 1, 5))
but that doesn't actually send the row to the function, instead assessing each value individually, so the output is:
   [,1]  [,2]  [,3]
b1 TRUE  TRUE FALSE
b2 TRUE FALSE  TRUE
Sticking a rowwise() into the middle, like
select(df, b1:b2) %>% rowwise() %>% apply(1, function(x) validScore(x, 1, 5))
makes no difference.
I thought it might be something to do with the form that the dplyr select returned, but
apply(df[, c("b1", "b2")], 1, function(x) validScore(x, 1, 5))
also generates the same result.
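A likely explanation for the matrix output, offered as a sketch rather than a definitive diagnosis: apply does pass each row to the function as a vector of length 2 (b1, b2), but validScore returns one logical per element, so apply binds the three length-2 results together as columns. The snippet below reuses the question's data and function; the name res is introduced here only for illustration.

```r
df <- data.frame(a = 1:3, b1 = c(4:5, NA), b2 = c(5, 6, 5))

validScore <- function(x, a, b) {
  is.na(x) == FALSE &
    x %% 1 == 0 &
    findInterval(x, c(a, b), rightmost.closed = TRUE) == 1
}

# Each call receives one row as a length-2 vector and returns 2 logicals
res <- apply(df[, c("b1", "b2")], 1, function(x) validScore(x, 1, 5))

# 2 rows (b1, b2) by 3 columns (one column per row of df)
dim(res)

# Collapsing each column with all() gives one value per original row
apply(res, 2, all)  # TRUE FALSE FALSE
```

Read this way, each column of the matrix is the per-value result for one row of df, and the row-level answer is simply whether every entry in that column is TRUE.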
You don't need dplyr or plyr. You can just use base R.
The first thing to do is to make validScore return only a single TRUE or FALSE. This can be done using the all function:
validScore <- function(x, a, b) {
test = is.na(x) == FALSE &
x %% 1 == 0 &
findInterval(x, c(a,b), rightmost.closed = TRUE) == 1
all(test)
}
After that, just use the standard apply:
## Select columns 2 & 3
apply(df[, 2:3], 1, validScore, a=1, b=8)
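Note that the call above uses b = 8, so the row (b1=5, b2=6) passes and the result is TRUE TRUE FALSE. With the question's bounds (a = 1, b = 5) the same approach gives the TRUE FALSE FALSE that was asked for. The sketch below checks this, and also shows an alternative that keeps the element-wise function and combines the columns with &, avoiding apply altogether. The names validScoreVec, byApply and byColumn are hypothetical and introduced only for this illustration.

```r
df <- data.frame(a = 1:3, b1 = c(4:5, NA), b2 = c(5, 6, 5))

# Element-wise check, as in the question (renamed here to keep both versions)
validScoreVec <- function(x, a, b) {
  !is.na(x) &
    x %% 1 == 0 &
    findInterval(x, c(a, b), rightmost.closed = TRUE) == 1
}

# Row-level check, as in the answer: TRUE only if every value passes
validScore <- function(x, a, b) all(validScoreVec(x, a, b))

# Approach from the answer, with the question's interval [1, 5]
byApply <- apply(df[, 2:3], 1, validScore, a = 1, b = 5)

# Alternative: test each column as a whole vector, then combine with &
byColumn <- validScoreVec(df$b1, 1, 5) & validScoreVec(df$b2, 1, 5)

byApply   # TRUE FALSE FALSE
byColumn  # TRUE FALSE FALSE
```

The column-wise version calls the function once per column rather than once per row, which tends to scale better on data frames with many rows.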