简体   繁体   English

如何遍历R中的列

[英]how to loop through columns in R

I have a very large data set including 250 string and numeric variables. 我有一个非常大的数据集,包括250个字符串和数字变量。 I want to compare one after another columns together. 我想一起比较另一列。 For example, I am going to compare (difference) the first variable with second one, third one with fourth one, fifth one with sixth one and so on. 例如,我将比较第一个变量与第二个变量,比较第三个变量与第四个变量,第五个变量与第六个变量,依此类推。
For example (The structure of the data set is something like this example), I want to compare number.x with number.y, day.x with day.y, school.x with school.y and etc. 例如(数据集的结构类似于此示例),我想比较number.x和number.y,day.x和day.y,school.x和school.y等。

number.x<-c(1,2,3,4,5,6,7)
number.y<-c(3,4,5,6,1,2,7)
day.x<-c(1,3,4,5,6,7,8)
day.y<-c(4,5,6,7,8,7,8)
school.x<-c("a","b","b","c","n","f","h")
school.y<-c("a","b","b","c","m","g","h")
city.x<- c(1,2,3,7,5,8,7)
city.y<- c(1,2,3,5,5,7,7) 

You mean, something like this? 你的意思是这样的?

> number.x == number.y
[1] FALSE FALSE FALSE FALSE FALSE FALSE  TRUE
> length(which(number.x==number.y))
[1] 1
> school.x == school.y
[1]  TRUE  TRUE  TRUE  TRUE FALSE FALSE  TRUE
> test.day <- day.x == day.y
> test.day
[1] FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE

EDIT : Given your example variables above, we have: 编辑 :给定您上面的示例变量,我们有:

df <- data.frame(number.x,
             number.y,
             day.x,
             day.y,
             school.x,
             school.y,
             city.x,
             city.y,
             stringsAsFactors=FALSE)

n <- ncol(df)  # no of columns (assumed EVEN number)

k <- 1
comp <- list()  # comparisons will be stored here

while (k <= n-1) {
      l <- (k+1)/2
      comp[[l]] <- df[,k] == df[,k+1]
      k <- k+2
}

After which, you'll have: 之后,您将拥有:

> comp
[[1]]
[1] FALSE FALSE FALSE FALSE FALSE FALSE  TRUE

[[2]]
[1] FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE

[[3]]
[1]  TRUE  TRUE  TRUE  TRUE FALSE FALSE  TRUE

[[4]]
[1]  TRUE  TRUE  TRUE FALSE  TRUE FALSE  TRUE

To get the comparison result between columns k and k+1 , you look at the (k+1)/2 element of comp - ie to get the comparison results between columns 7 & 8, you look at the comp element 8/2=4 : 要获取列kk+1之间的比较结果,请查看comp(k+1)/2元素-即,要获取列7和8之间的比较结果,请查看comp元素comp 8/2=4

> comp[[4]]
[1]  TRUE  TRUE  TRUE FALSE  TRUE FALSE  TRUE

EDIT 2 : To have the comparisons as new columns in the dataframe: 编辑2 :要将比较作为数据框中的新列:

new.names <- rep('', n/2)
for (i in 1:(n/2)) {
     new.names[i] <- paste0('V', i)
}

cc <- as.data.frame(comp, optional=TRUE)
names(cc) <- new.names

df.new <- cbind(df, cc)

After which, you have: 之后,您将具有:

> df.new
  number.x number.y day.x day.y school.x school.y city.x city.y    V1    V2    V3    V4
1        1        3     1     4        a        a      1      1 FALSE FALSE  TRUE  TRUE
2        2        4     3     5        b        b      2      2 FALSE FALSE  TRUE  TRUE
3        3        5     4     6        b        b      3      3 FALSE FALSE  TRUE  TRUE
4        4        6     5     7        c        c      7      5 FALSE FALSE  TRUE FALSE
5        5        1     6     8        n        m      5      5 FALSE FALSE FALSE  TRUE
6        6        2     7     7        f        g      8      7 FALSE  TRUE FALSE FALSE
7        7        7     8     8        h        h      7      7  TRUE  TRUE  TRUE  TRUE

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM