R - How to save the results of a loop within a loop?

OK, so the goal is to read CSV files containing meteorological information specific to a weather station through a loop that provides the start date, end date, and the percentiles of the dry bulb temperature and humidex, along with the station number. So far I've been able to make a loop that provides this information for one CSV file with data for 6-20 stations, and it works well. Now I've attempted to nest that loop within another that runs over multiple CSV files from a folder, but I can only get the station IDs, not the other four results. I want to know how I can save all this data in the form of lists, separate vectors, or a data frame.

Here's the code.

files <- list.files("R_Test_Nicholas", pattern="*.csv", full.names=TRUE)
l <- length(files)
r <- NULL
first <- NULL
for(k in 1:l) {
  first <- read.csv(files[k])
  library(openair)
  station <-unique(first$STN_ID)
  n = length(station)
  first$date <- paste(first$LOCAL_YEAR, first$LOCAL_MONTH, first$LOCAL_DAY, first$LOCAL_HOUR, sep=" ")
  first$date <- as.POSIXct(first$date, format="%Y %m %d %H", "UTC")

  result7 <- NULL
  for (i in 1:n) {
    curr_station <- subset(first, STN_ID==station[i], select=c(date, DRY_BULB_TEMP, HUMIDEX, LOCAL_YEAR))
    test <- cutData(curr_station, type = "season")
    timers <- selectByDate(test, start = "YYYY-mm-dd", year = 1971:2000)
    trial <- subset(timers, season == "summer (JJA)")
    Ptile <- quantile(trial$DRY_BULB_TEMP, c(0.95), na.rm=T)
    Htile <- quantile(trial$HUMIDEX, c(0.95), na.rm=T)
    date_start <- min(trial$LOCAL_YEAR, na.rm=T)
    date_last <- max(trial$LOCAL_YEAR, na.rm=T)
    result7$stn[i] <- station[i]
    result7$Ptile[i] <- Ptile
    result7$Htile[i] <- Htile
    result7$date_start[i] <- date_start
    result7$date_last[i] <- date_last
  }
  r$STN_ID[k] <- result7
  r$Ptile[k] <- result7$Ptile[i]
  r$Htile[k] <- result7$Htile[i]
  r$date_start[k] <- result7$date_start[i]
  r$date_last[k] <-result7$date_last[i]
}
r
$STN_ID
$STN_ID[[1]]
[1] 6130 6137 6140 6157 6205 6207 6250 6256 6915 6916 6918 6919 7026 7558 9821

$STN_ID[[2]]
[1] 10808 10981 26823 26968 27746 29493 30309 30598 42103 43323 43383 45090 48568 48768 50309 50310 51537

$STN_ID[[3]]
[1] 6312 6330 6339 6345 6354 6356 6358 6369 6399 6442 6454 6465 6468 6486 6491 6501 6516 6923 7103 7162 7169 7173 8990 9033 9833

$STN_ID[[4]]
[1] 10078 10661 10792 10848 10859 10914 10936 10945 10969 10970 26824 26864 27141 27223 27592 27600 27868 30326 30668 31829 41575 42083 42243 43123 43124 43183 43403 43404 43405 43406 44363
[32] 44503 46007 47187 48668 49748 50133 50408 50620

$STN_ID[[5]]
[1] 6526 6547 7177

$STN_ID[[6]]
[1] 10800 10814 27846 30308 31029 41903 50621


$Ptile
[1] 23.9   NA 19.5   NA 21.2   NA

$Htile
[1] 29.35    NA 24.34    NA 26.96    NA

$date_start
[1] 1986  Inf 1986  Inf 1978  Inf

$date_last
[1] 2000 -Inf 1996 -Inf 2000 -Inf

You can use the [[ operator to create and access list elements. Because no reproducible data was provided, the mtcars dataset is used for this demo.

data(mtcars)
NCol = ncol(mtcars)
ObjList = list()

for (i in 1:NCol) {
  # one sub-list per column of mtcars
  ObjList[[i]] = list()
  ObjList[[i]]['max'] = max(mtcars[, i])
  ObjList[[i]]['min'] = min(mtcars[, i])
  ObjList[[i]]['mean'] = mean(mtcars[, i])
  ObjList[[i]]['sd'] = sd(mtcars[, i])
  # [[ ]] is needed to store a whole data frame as a single element
  ObjList[[i]][["head"]] = head(mtcars)
  ObjList[[i]][["tail"]] = tail(mtcars)
}

max_vec=as.vector(do.call(cbind,lapply(ObjList,function(x) x[["max"]])))
#[1]  33.900   8.000 472.000 335.000   4.930   5.424  22.900   1.000   1.000   5.000   8.000


max_orig=as.vector(apply(mtcars,2,max))
#[1]  33.900   8.000 472.000 335.000   4.930   5.424  22.900   1.000   1.000   5.000   8.000

setdiff(max_vec,max_orig)
#numeric(0)
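If the end goal is a data frame rather than a nested list, the scalar entries can be pulled out field by field. The following is only a sketch under assumptions: the names summary_df and variable are made up for illustration, and only the scalar summaries are kept (not the head and tail entries).

```r
# Build the same kind of nested list as above, keeping only the scalar summaries.
data(mtcars)
ObjList <- list()
for (i in seq_len(ncol(mtcars))) {
  ObjList[[i]] <- list(
    max  = max(mtcars[, i]),
    min  = min(mtcars[, i]),
    mean = mean(mtcars[, i]),
    sd   = sd(mtcars[, i])
  )
}

# Extract one field from every list element with sapply(); each field becomes a column.
# summary_df and the "variable" column are hypothetical names chosen for this sketch.
summary_df <- data.frame(
  variable = names(mtcars),
  max  = sapply(ObjList, function(x) x[["max"]]),
  min  = sapply(ObjList, function(x) x[["min"]]),
  mean = sapply(ObjList, function(x) x[["mean"]]),
  sd   = sapply(ObjList, function(x) x[["sd"]])
)
```

Each row of summary_df then describes one column of mtcars, which is the same shape the question asks for (one row per station).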

Without a short dataset it is difficult to reproduce your code and understand what you want to do. However, it looks like you should use a matrix to collect your data:

result7 <- matrix(NA, nrow = n, ncol = 5) # instead of NULL; here assuming one row per station and one column per result

for (i in 1:n) {
  # ... (the rest of the code; then, when you get a result, collect it:)
  result7[i, 1] <- resultA  # result A of your calculations goes in row i, first column
  result7[i, 2] <- resultB  # result B of your calculations goes in row i, second column
}

You can then transform it into a data frame and give names to the columns:

result7 <- data.frame(result7)
colnames(result7) <- c("your column names here")

When you have finished all the data frames, just use rbind to join them by rows, which can be identified by an ID.
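The matrix approach described above could look roughly like the sketch below. Everything in it is an assumption made for illustration: fake_files stands in for the CSV files read with read.csv, only the 95th percentile of DRY_BULB_TEMP is computed, and all_results and the file column are hypothetical names.

```r
set.seed(1)
# Hypothetical stand-in for two CSV files: each "file" is a data frame of readings.
fake_files <- list(
  data.frame(STN_ID = rep(c(101, 102), each = 50),
             DRY_BULB_TEMP = rnorm(100, mean = 20, sd = 5)),
  data.frame(STN_ID = rep(c(201, 202, 203), each = 50),
             DRY_BULB_TEMP = rnorm(150, mean = 18, sd = 4))
)

all_results <- NULL
for (k in seq_along(fake_files)) {
  first <- fake_files[[k]]
  station <- unique(first$STN_ID)
  n <- length(station)

  # One row per station, one column per result (station ID and percentile).
  result7 <- matrix(NA_real_, nrow = n, ncol = 2)
  for (i in seq_len(n)) {
    temps <- first$DRY_BULB_TEMP[first$STN_ID == station[i]]
    result7[i, 1] <- station[i]
    result7[i, 2] <- quantile(temps, 0.95, na.rm = TRUE)
  }

  # Convert to a data frame, name the columns, and tag the rows with the file index.
  result7 <- data.frame(result7)
  colnames(result7) <- c("STN_ID", "Ptile")
  result7$file <- k

  # Append this file's rows to the combined result.
  all_results <- rbind(all_results, result7)
}
```

With this layout every station keeps its own row, so no results are lost when the outer loop moves on to the next file.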

r$STN_ID[k] <- result7
r$Ptile[k] <- result7[2]
r$Htile[k] <- result7[3]
r$date_start[k] <- result7[4]
r$date_last[k] <- result7[5]

That's what I ended up doing, which gave me a bunch of lists within a list that I'm going to change into data frames. Thanks to those who took the time to answer.

This is a pretty specific example of a common question about concatenating new data to an existing data frame within a loop. For completeness, I'd like to share a generic way of doing this.

The idea is to start with an empty data frame, do your calculations within a loop, and then use rbind() to add the data. So...

df <- NULL
for (n in seq(1, 10)) {
  # append one row per iteration
  df <- rbind(df,
              data.frame(n = n,
                         n.squared = n^2))
}

In this example, you would end up with a 10-row data frame with columns n and n.squared.
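Calling rbind() inside the loop copies the growing data frame on every iteration, which can become slow for long loops. A common alternative, sketched here with the hypothetical name rows, is to collect the pieces in a list and bind them once at the end:

```r
# Pre-allocate a list with one slot per iteration.
rows <- vector("list", 10)
for (n in seq_len(10)) {
  # Each slot holds a one-row data frame.
  rows[[n]] <- data.frame(n = n, n.squared = n^2)
}

# Bind all one-row data frames together in a single call.
df <- do.call(rbind, rows)
```

The result has the same 10 rows and the same two columns, n and n.squared.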
