R - Looping over Columns then Rows

I was wondering if anyone could help me with a problem I'm having in R. It involves looping over columns and then rows; the example below should make it clear. I have the 5x5 table below. Using row 1 as an example, I would like to count how many of the values in V2:V5 are lower than the value in V1, and express that count as a proportion of the four columns.

set.seed(1)
data=as.data.frame(replicate(5, rnorm(5)))

      V1         V2         V3          V4          V5
 1 -0.6264538 -0.8204684  1.5117812 -0.04493361  0.91897737
 2  0.1836433  0.4874291  0.3898432 -0.01619026  0.78213630
 3 -0.8356286  0.7383247 -0.6212406  0.94383621  0.07456498
 4  1.5952808  0.5757814 -2.2146999  0.82122120 -1.98935170
 5  0.3295078 -0.3053884  1.1249309  0.59390132  0.61982575


test=lapply(2:5,function(a){
ifelse(data[1,1]<=data[1,a],1,0)})
testtable=(as.data.frame(table(unlist(test)))[1,2])/4
testtable
[1] 0.25

This means that in row 1, only 1 of the 4 values in V2:V5 is lower than V1. I'd like to add a second loop so that this runs for each row separately. I tried:

test2=lapply(2:5,function(a){
lapply(1:5,function(b){
ifelse(original_permuted_results[b,1]<=original_permuted_results[a,b],1,0)
(as.data.frame(table(unlist(test)))[1,2])/4})})

Resulting in

[[1]]
[[1]][[1]]
[1] 0.25

[[1]][[2]]
[1] 0.25

[[1]][[3]]
[1] 0.25

[[1]][[4]]
[1] 0.25

[[1]][[5]]
[1] 0.25


[[2]]
[[2]][[1]]
[1] 0.25

And it continues like that, printing 0.25 as the result for every remaining iteration. It should produce, ignoring the words in brackets:

(for row 1) 0.25  
(for row 2) 0.25
(for row 3) 0
(for row 4) 1
(for row 5) 0.25

I had a trawl through the archives but couldn't find anything. My actual data has 300+ rows and 10000 columns, but the output I'm trying to achieve is exactly the same. If anyone has any suggestions, that would be very much appreciated. Thanks.

You don't need loops. You can take advantage of vectorization:

cat(paste("(for row", 1:nrow(data), ")", 
  rowSums(data[, 1] > data[, 2:5]) / 4),    # this is where it all happens
  sep="\n"
)

Produces:

(for row 1 ) 0.25
(for row 2 ) 0.25
(for row 3 ) 0
(for row 4 ) 1
(for row 5 ) 0.25

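Here we take advantage of > comparing the vector data[, 1] against every column of data[, 2:5] at once (the vector is recycled down each column). The result is a logical matrix, and rowSums counts the TRUE values in each row.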
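
The question mentions 300+ rows and 10000 columns, so a variant that does not hard-code the column range or the divisor may be useful. The following is a minimal sketch under that assumption, not part of the original answer; the names lower_than_first and props are illustrative. It relies on rowMeans() of a logical matrix giving the proportion of TRUE values per row.

```r
# Recreate the example data from the question.
set.seed(1)
data <- as.data.frame(replicate(5, rnorm(5)))

# Compare every column except the first against V1.
# The comparison returns a logical matrix with one row per data row.
lower_than_first <- data[, -1] < data[, 1]

# The mean of TRUE/FALSE values is the proportion of TRUE,
# so neither the column range nor the divisor is hard-coded.
props <- rowMeans(lower_than_first)
print(unname(props))
# [1] 0.25 0.25 0.00 1.00 0.25
```

On a very wide data frame, converting with as.matrix(data) before the comparison may be faster, though that has not been measured here.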

Does this work? (Comparing V1 with itself is always FALSE, so including that column adds nothing to the count.)

vec<-rowSums(data<data$V1)/4

> vec
[1] 0.25 0.25 0.00 1.00 0.25

Very similar to @BrodieG, but perhaps a little clearer:

# Find when each column is less than the first column.
lower.than.first<-sapply(data[2:5],function(x) x<data[,1])
# Count, for each row, how many columns are lower than the first.
num.true<-rowSums(lower.than.first) # TRUE is 1, and FALSE is 0, when summing.
# Get the proportion.
props<-num.true/ncol(lower.than.first)
# [1] 0.25 0.25 0.00 1.00 0.25
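
For comparison, here is a sketch of the explicit row-by-row loop the question was attempting. The nested lapply in the question prints 0.25 everywhere because the value of ifelse(...) is discarded and the last expression in the inner function recomputes the proportion from the earlier test object. The version below is only one possible fix, and row_props is a hypothetical name.

```r
# Recreate the example data from the question.
set.seed(1)
data <- as.data.frame(replicate(5, rnorm(5)))

# A single loop over the rows is enough; the comparison across
# columns is vectorised inside each iteration.
row_props <- sapply(seq_len(nrow(data)), function(i) {
  first  <- data[i, 1]            # value in V1 for this row
  others <- unlist(data[i, -1])   # remaining values of the row as a plain vector
  mean(others < first)            # proportion of values lower than V1
})
print(row_props)
# [1] 0.25 0.25 0.00 1.00 0.25
```

The vectorised answers above avoid the per-row function call, which matters more as the number of rows grows.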


 