R: for loop with if statement: running a function for multiple rows and populating a new matrix

I am currently writing my first for loop and I am having a little trouble with it. I've created a function "b.error" that I want to apply to each row of my dataset. This function uses multiple columns from each row. After I run my function, I would like to pull out the rows whose "b.error" result is below a defined threshold and put them in a new matrix, with an additional column holding the result of the "b.error" function. I was thinking I would need to use an if statement for that part.

So far, here's what I've got:

b.max=200050612500
b.mean=65445001210.3952
b.sum=3176500943750
b.tmax=0.5166689375

data<-read.csv(file.choose(), header=T)
ID=data[, c(1)]
Max=data[, c(2)]
Mean=data[, c(3)]
Sum=data[, c(4)]
Tmax=data[, c(5)]

b.av.error=0.464689312424088
b.SE=0.0629050598187672
threshold=b.av.error+b.SE

b.error<-function(a,b,c,d)
{max.er<-abs(a-b.max)/max(a, b.max)
mean.er<-abs(b-b.mean)/max(b, b.mean)
sum.er<-abs(c-b.sum)/max(c, b.sum)
tmax.er<-abs(d-b.tmax)/max(d, b.tmax)
cum<-(max.er+mean.er+sum.er+tmax.er)/4
cum}

b.flashes<-matrix(data=NA,nrow=,ncol=6)
colnames(b.flashes)<-c("ID","BLmax","BLmean","CumSum","Tmax","CumError")

I was thinking something like this for my loop, but I am stuck on how to get my function to run for each row and how to populate the b.flashes matrix, especially if I don't know how many rows it will end up having.

for (i in 1:length(data)){
  error<-b.error(Max, Mean, Sum, Tmax)
  if (error<=threshold)
}

The files I import are set up like this. These are the first 10 rows of this particular dataset, but the datasets on which I need to run the "b.error" function all have different lengths.

data
     ID       Blmax       Blmean      Cumsum     Tmax
1   b.1 3.00762e+10   8518829268 3.76000e+11 0.383330
2   b.2 1.67000e+11  89634946154 1.67000e+12 0.316670
3   b.3 1.95000e+11  78450661017 1.06000e+12 0.150000
4   b.4 2.28000e+11  59976231496 1.93000e+12 0.250000
5   b.5 2.17266e+10   6730313333 8.89497e+10 0.116670
6   b.6 2.33142e+10  14368725000 1.68000e+11 0.200000
7   b.7 1.85000e+11  42342807383 1.95000e+12 0.483330
8   b.8 1.84000e+11  40587636765 2.47000e+12 0.450000
9   b.9 2.49000e+11  59006598913 4.22000e+12 0.466670
10 b.10 6.09000e+11 207000000000 2.59000e+13 1.316700

Any suggestions?

Thanks!

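You do not need a loop: b.error can be applied to whole columns at once. Arithmetic and abs() in R work element by element on numeric vectors, so passing the four columns returns a vector with one error per row of data. One change to the function is needed first: max(a, b.max) collapses all of its arguments to a single number, so replace each max() inside b.error with pmax(), which takes the maximum element by element.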

Just do

bflashes <- data
bflashes$CumError <- b.error(data$Blmax, data$Blmean, data$Cumsum, data$Tmax)

CumError is now a numeric column with one value per row. Then, to filter on your criterion:

bflashes <- subset(bflashes, CumError <= threshold)

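If you need that as a matrix: as.matrix(bflashes). Because the ID column is text, the result will be a character matrix; drop ID first if you need a numeric one.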
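
Putting the pieces together, here is a minimal end-to-end sketch of the vectorized approach. It makes two assumptions that are not in the original posts: the ten sample rows from the question are typed in by hand as a stand-in for the read.csv() call, and b.error is rewritten with pmax() in place of max() so that each denominator is computed per row rather than once for the whole column.

```r
# Reference values and threshold, copied from the question.
b.max  <- 200050612500
b.mean <- 65445001210.3952
b.sum  <- 3176500943750
b.tmax <- 0.5166689375
threshold <- 0.464689312424088 + 0.0629050598187672

# Assumption: the ten sample rows from the question, entered by hand
# in place of read.csv(file.choose(), header = TRUE).
data <- data.frame(
  ID     = paste0("b.", 1:10),
  Blmax  = c(3.00762e10, 1.67e11, 1.95e11, 2.28e11, 2.17266e10,
             2.33142e10, 1.85e11, 1.84e11, 2.49e11, 6.09e11),
  Blmean = c(8518829268, 89634946154, 78450661017, 59976231496, 6730313333,
             14368725000, 42342807383, 40587636765, 59006598913, 207000000000),
  Cumsum = c(3.76e11, 1.67e12, 1.06e12, 1.93e12, 8.89497e10,
             1.68e11, 1.95e12, 2.47e12, 4.22e12, 2.59e13),
  Tmax   = c(0.38333, 0.31667, 0.15, 0.25, 0.11667,
             0.2, 0.48333, 0.45, 0.46667, 1.3167),
  stringsAsFactors = FALSE
)

# Same formula as the question's b.error, but with pmax() instead of max()
# so that every row is divided by its own maximum.
b.error <- function(a, b, c, d) {
  max.er  <- abs(a - b.max)  / pmax(a, b.max)
  mean.er <- abs(b - b.mean) / pmax(b, b.mean)
  sum.er  <- abs(c - b.sum)  / pmax(c, b.sum)
  tmax.er <- abs(d - b.tmax) / pmax(d, b.tmax)
  (max.er + mean.er + sum.er + tmax.er) / 4
}

# One call computes the error for every row; subset() keeps the rows
# at or below the threshold, however many there turn out to be.
bflashes <- data
bflashes$CumError <- b.error(data$Blmax, data$Blmean, data$Cumsum, data$Tmax)
bflashes <- subset(bflashes, CumError <= threshold)
print(bflashes)
```

The difference between the two functions is easy to see on a small input: max(c(1, 5), 3) returns the single value 5, while pmax(c(1, 5), 3) returns the vector 3 5. Because subset() builds the result in one step, there is also no need to know the number of matching rows in advance, which was the sticking point with pre-allocating the matrix.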
