
Cycling through columns in R

I have data collected from a survey. The csv file looks something like this.

1c x x 1e x x 2c x x 2e x x 

D  x x D  x x R  x x R  x x 

R  x x R  x x D  x x D  x x 

D  x x D  x x R  x x R  x x 

R  x x R  x x R  x x R  x x 

etc, etc...

The x's represent other data that are not being used in this analysis.

Responses from the 1c and 1e (or any paired columns) should be the same. It was done as a manipulation check to test if participants were paying attention. I want to count the number of "D"s and the number of "R"s, but if paired columns do not match they don't get counted.

Right now I am doing something like this:

    final <- read.csv("data.csv", stringsAsFactors = FALSE)

    count <- character(0)  # matching responses are collected here

    for (i in seq_len(nrow(final))) {
      if (final$X1c[i] == final$X1e[i]) {
        count <- append(count, as.character(final$X1c[i]))
      }
    }
    for (i in seq_len(nrow(final))) {
      if (final$X2c[i] == final$X2e[i]) {
        count <- append(count, as.character(final$X2c[i]))
      }
    }

and on and on and on.

How can I do this so that I don't have to have a separate for loop for every single question?

You can simply keep two different counters inside a single loop to capture both counts (or however many you have):

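    final <- read.csv("data.csv", stringsAsFactors = FALSE)

    # Create the counters before the loop so append() has something to extend
    count  <- character(0)
    count2 <- character(0)

    for (i in seq_len(nrow(final))) {
      # Question 1: keep the response only if the paired columns agree
      if (final$X1c[i] == final$X1e[i]) {
        count <- append(count, as.character(final$X1c[i]))
      }
      # Question 2: same check on the second pair
      if (final$X2c[i] == final$X2e[i]) {
        count2 <- append(count2, as.character(final$X2c[i]))
      }
    }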

Either way, count and count2 should be created outside the loop (for example, as empty character vectors or a table) before anything is appended to them.

If you have a mega-ton of variables, you can create a list, table or some other vector of pairs to send into a nested loop to iterate over the column pairs to be compared.
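The idea of iterating over a vector of column pairs could look roughly like the sketch below. It assumes the columns follow the X<number>c / X<number>e naming that read.csv produces from the headers in the question; the toy data frame and the `questions` vector are made up for illustration.

```r
# Toy data standing in for the survey file (hypothetical values).
final <- data.frame(
  X1c = c("D", "R", "D", "R"),
  X1e = c("D", "R", "R", "R"),
  X2c = c("R", "D", "R", "R"),
  X2e = c("R", "D", "R", "D"),
  stringsAsFactors = FALSE
)

# Question numbers; each one is assumed to have a "c" and an "e" column.
questions <- c("1", "2")

count <- character(0)
for (q in questions) {
  check <- as.character(final[[paste0("X", q, "c")]])
  echo  <- as.character(final[[paste0("X", q, "e")]])
  # Keep only responses where the paired columns agree;
  # which() also drops rows where either value is NA.
  count <- c(count, check[which(check == echo)])
}

table(count)
```

Comparing whole columns at once replaces the row-by-row loop, so the only loop left is the one over questions. Adding a question then means adding one entry to `questions`.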

However, if all you are interested in is the total number of times the two columns of a pair agree (D == D or R == R), across multiple column pairs and whatever values each pair can take, you might consider using the dplyr package.

If you use group_by() on the two columns of a pair, then summarize() to count each combination of values and filter() to keep only the combinations where the two values are equal, you get a table of counts for the matching responses.
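For a single pair, that pipeline might be sketched as follows. This assumes dplyr 1.0 or later is installed (for the `.groups` argument); the toy data frame and the name `pair_counts` are made up for illustration.

```r
library(dplyr)

# Toy data standing in for the survey file (hypothetical values).
final <- data.frame(
  X1c = c("D", "R", "D", "R"),
  X1e = c("D", "R", "R", "R"),
  stringsAsFactors = FALSE
)

pair_counts <- final %>%
  group_by(X1c, X1e) %>%                   # one group per combination of answers
  summarize(n = n(), .groups = "drop") %>% # count the rows in each combination
  filter(X1c == X1e)                       # keep only the matching combinations

pair_counts
```

The same pipeline would have to be repeated (or wrapped in a function) for each additional pair such as X2c and X2e.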

Here is a good reference on using dplyr in this way: dplyr tutorial using the mtcars dataset.
