Is there an R function to compare one column value with all values of another column?
I am working with two different large data sets and trying to use mapply() to get iterative functions working.
The goal is to take each value in column 'a' and compare it against all the values in column 'b'. If an element of 'a' is greater than any element of 'b', then the 'compar' column is 'yes'.
df <- data.frame(a = c(10, 15, 8), b = c(22, 11, 9))
I want the output to look something like this:
   a  b compar
1 10 22    yes
2 15 11    yes
3  8  9     no
library(dplyr)
library(lubridate)
df <- data.frame(
  col1 = c('2001-06-30 01:00:00', '2001-07-01 01:00:00', '2000-07-01 01:00:00'),
  col2 = c('2000-08-01 01:00:00', '2003-07-01 01:00:00', '2004-06-30 01:00:00')
)
df <- df %>%
  # parse the date-time strings
  mutate(across(c(col1, col2), ymd_hms)) %>%
  # add the new column (row-wise comparison of col1 and col2)
  mutate(compar = case_when(col1 > col2 ~ 'yes', TRUE ~ 'no'))
You can do
df$compar <- c('no', 'yes')[(df$a > df$b) + 1]
Or
df$compar <- ifelse(df$a > df$b, "yes", "no")
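Note that both one-liners above compare the columns row by row (a[i] against b[i]), which gives no, yes, no for the sample data. If the intent is the question's wording, where a value of 'a' gets 'yes' when it exceeds any value of 'b', one possible sketch is to compare against the smallest value of 'b'. This assumes 'b' contains no NA values.

```r
df <- data.frame(a = c(10, 15, 8), b = c(22, 11, 9))

# A value exceeds at least one element of b exactly when it exceeds min(b)
# (assumption: b has no NA values)
df$compar <- ifelse(df$a > min(df$b), "yes", "no")
df
```

For the sample data this reproduces the yes, yes, no column shown in the question.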
Here's a working example based on your original post.
# as.Date() keeps only the date part; the time of day is dropped
col1 <- as.Date(c("2001-06-30 01:00:00 UTC", "2001-07-01 01:00:00 UTC", "2000-07-01 01:00:00 UTC"))
col2 <- as.Date(c("2000-08-01 01:00:00 UTC", "2003-07-01 01:00:00 UTC", "2004-06-30 01:00:00 UTC"))
df <- data.frame(col1, col2)
df
# TRUE if elem is greater than at least one value in datafr$col2
check <- function(elem, datafr) {
  any(elem > datafr$col2, na.rm = TRUE)
}
# Apply check() to every element of vector and collect the results
addCol3 <- function(vector, dframe) {
  vr <- logical(length(vector))  # preallocated result vector
  for (i in seq_along(vector)) {
    vr[i] <- check(vector[i], dframe)
  }
  vr
}
df$col3 <- addCol3(df$col1, df)
df
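The explicit loop can also be written with vapply(). This is only a sketch of the same check; it rebuilds the sample data (dates only) so that it runs on its own.

```r
col1 <- as.Date(c("2001-06-30", "2001-07-01", "2000-07-01"))
col2 <- as.Date(c("2000-08-01", "2003-07-01", "2004-06-30"))
df <- data.frame(col1, col2)

# For each row index, test whether col1[i] is later than at least one date in col2
df$col3 <- vapply(
  seq_along(df$col1),
  function(i) any(df$col1[i] > df$col2),
  logical(1)
)
df
```

Indexing with seq_along() rather than iterating over the dates directly is a cautious choice that keeps each element's Date class intact inside the anonymous function.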