I have a very large data set with 250 string and numeric variables. I want to compare (find the differences between) consecutive pairs of columns: the first variable with the second, the third with the fourth, the fifth with the sixth, and so on.
For example (the data set is structured like the sample below), I want to compare number.x with number.y, day.x with day.y, school.x with school.y, and so on.
number.x<-c(1,2,3,4,5,6,7)
number.y<-c(3,4,5,6,1,2,7)
day.x<-c(1,3,4,5,6,7,8)
day.y<-c(4,5,6,7,8,7,8)
school.x<-c("a","b","b","c","n","f","h")
school.y<-c("a","b","b","c","m","g","h")
city.x<- c(1,2,3,7,5,8,7)
city.y<- c(1,2,3,5,5,7,7)
You mean, something like this?
> number.x == number.y
[1] FALSE FALSE FALSE FALSE FALSE FALSE TRUE
> length(which(number.x==number.y))
[1] 1
> school.x == school.y
[1] TRUE TRUE TRUE TRUE FALSE FALSE TRUE
> test.day <- day.x == day.y
> test.day
[1] FALSE FALSE FALSE FALSE FALSE TRUE TRUE
EDIT: Given your example variables above, we have:
df <- data.frame(number.x,
                 number.y,
                 day.x,
                 day.y,
                 school.x,
                 school.y,
                 city.x,
                 city.y,
                 stringsAsFactors=FALSE)
n <- ncol(df) # no of columns (assumed EVEN number)
k <- 1
comp <- list() # comparisons will be stored here
while (k <= n-1) {
  l <- (k+1)/2
  comp[[l]] <- df[,k] == df[,k+1]
  k <- k+2
}
After which, you'll have:
> comp
[[1]]
[1] FALSE FALSE FALSE FALSE FALSE FALSE TRUE
[[2]]
[1] FALSE FALSE FALSE FALSE FALSE TRUE TRUE
[[3]]
[1] TRUE TRUE TRUE TRUE FALSE FALSE TRUE
[[4]]
[1] TRUE TRUE TRUE FALSE TRUE FALSE TRUE
To get the comparison result between columns k and k+1, look at element (k+1)/2 of comp. For example, to get the comparison results between columns 7 and 8, look at element 8/2 = 4 of comp:
> comp[[4]]
[1] TRUE TRUE TRUE FALSE TRUE FALSE TRUE
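The same pairing can also be written without an explicit counter. The following is only a minimal sketch, assuming the columns strictly alternate (.x, then the matching .y) and that the column count is even; the small df here is a cut-down copy of the question's data, and naming the results after the shared prefix is my own choice, not part of the original answer.

```r
# Small example frame mirroring the question's layout (.x/.y columns alternate).
df <- data.frame(
  number.x = c(1, 2, 3, 4, 5, 6, 7),
  number.y = c(3, 4, 5, 6, 1, 2, 7),
  school.x = c("a", "b", "b", "c", "n", "f", "h"),
  school.y = c("a", "b", "b", "c", "m", "g", "h"),
  stringsAsFactors = FALSE
)

odd  <- seq(1, ncol(df), by = 2)  # positions of the .x columns
even <- odd + 1                   # positions of the matching .y columns

# Map() walks both sets of columns in parallel and compares each pair.
comp <- Map(`==`, df[odd], df[even])

# Name each result after the shared prefix, e.g. "number", "school".
names(comp) <- sub("\\.x$", "", names(df)[odd])
```

With named elements, comp$school or comp[["number"]] can be used instead of working out the (k+1)/2 index by hand.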
EDIT 2: To have the comparisons as new columns in the dataframe:
new.names <- rep('', n/2)
for (i in 1:(n/2)) {
  new.names[i] <- paste0('V', i)
}
cc <- as.data.frame(comp, optional=TRUE)
names(cc) <- new.names
df.new <- cbind(df, cc)
After which, you have:
> df.new
  number.x number.y day.x day.y school.x school.y city.x city.y    V1    V2    V3    V4
1        1        3     1     4        a        a      1      1 FALSE FALSE  TRUE  TRUE
2        2        4     3     5        b        b      2      2 FALSE FALSE  TRUE  TRUE
3        3        5     4     6        b        b      3      3 FALSE FALSE  TRUE  TRUE
4        4        6     5     7        c        c      7      5 FALSE FALSE  TRUE FALSE
5        5        1     6     8        n        m      5      5 FALSE FALSE FALSE  TRUE
6        6        2     7     7        f        g      8      7 FALSE  TRUE FALSE FALSE
7        7        7     8     8        h        h      7      7  TRUE  TRUE  TRUE  TRUE
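Since the question mentions taking a difference, here is a hedged variant that returns x - y for numeric pairs and falls back to an equality test for everything else. compare_pair and the ".cmp" suffix are hypothetical names introduced for this illustration, and the three-row df is made-up sample data in the same .x/.y layout.

```r
# Made-up sample data in the same alternating .x/.y layout.
df <- data.frame(
  day.x    = c(1, 3, 4),
  day.y    = c(4, 5, 4),
  school.x = c("a", "b", "n"),
  school.y = c("a", "b", "m"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: difference for numeric pairs, equality otherwise.
compare_pair <- function(x, y) {
  if (is.numeric(x) && is.numeric(y)) x - y else x == y
}

odd <- seq(1, ncol(df), by = 2)                 # positions of the .x columns
res <- Map(compare_pair, df[odd], df[odd + 1])  # one result per column pair

# Label each result with the shared prefix plus a ".cmp" suffix.
names(res) <- paste0(sub("\\.x$", "", names(df)[odd]), ".cmp")

# Append the results as new columns.
df.new <- cbind(df, as.data.frame(res))
```

Here df.new gains a numeric day.cmp column and a logical school.cmp column, so the result type tells you which kind of comparison was applied to each pair.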