
Subset CSV files in for loop by date and save file with different name

I have a set of hydrological gages, stored as a list named lgages

gages<-read.delim("D:\\Baylor University\\Texas EFlow Spatial\\TX Gages List only txt.txt", colClasses = c("Gage_ID" = "character"))

# Set Working Directory
setwd("D:\\Baylor University\\Texas EFlow Spatial")

#using txt and defining as Character in order to not lose leading zeros in gage id
lgages<-as.list(gages$Gage_ID)

Code <- "00060"         #daily flow
Stat <- "00003"         #mean
start.date <- "1900-01-01"
end.date <- "2020-12-31" #statistical package requires full years of data, either calendar or water
dat.type <- ".csv"
path_2 <- "D:\\Baylor University\\Texas EFlow Spatial\\Yearly_Stats_Red\\"
file_list <- list.files(path_2)

An example of the file list I am working with:

dput(file_list)
c("07223300.csv", 
"07227000.csv", 
"07227200.csv", 
"07227420.csv", 
"07227448.csv", 
"07227470.csv" 
)

I am trying to split the data in each csv file by year. The first ten rows of each file look like this:

   X   staid val      dates qualcode
1   1 7223300  76 1970-10-01        A
2   2 7223300  90 1970-10-02        A
3   3 7223300  94 1970-10-03        A
4   4 7223300 110 1970-10-04        A
5   5 7223300 124 1970-10-05        A
6   6 7223300 140 1970-10-06        A
7   7 7223300 156 1970-10-07        A
8   8 7223300 135 1970-10-08        A
9   9 7223300 117 1970-10-09        A
10 10 7223300 112 1970-10-10        A
    X   staid val      dates qualcode
1   1 7227000  20 1909-01-01        A
2   2 7227000  20 1909-01-02        A
3   3 7227000  20 1909-01-03        A
4   4 7227000  20 1909-01-04        A
5   5 7227000  20 1909-01-05        A
6   6 7227000  20 1909-01-06        A
7   7 7227000  20 1909-01-07        A
8   8 7227000  20 1909-01-08        A
9   9 7227000  20 1909-01-09        A
10 10 7227000  20 1909-01-10        A
    X   staid val      dates qualcode
1   1 7227200   0 1966-06-17      A R
2   2 7227200   0 1966-06-18      A R
3   3 7227200   0 1966-06-19      A R
4   4 7227200   0 1966-06-20      A R
5   5 7227200   0 1966-06-21      A R
6   6 7227200   0 1966-06-22      A R
7   7 7227200   0 1966-06-23      A R
8   8 7227200   0 1966-06-24      A R
9   9 7227200  71 1966-06-25        A
10 10 7227200 130 1966-06-26        A
    X   staid val      dates qualcode
1   1 7227420   0 2007-10-01      A R
2   2 7227420   0 2007-10-02      A R
3   3 7227420   0 2007-10-03      A R
4   4 7227420   0 2007-10-04      A R
5   5 7227420   0 2007-10-05      A R
6   6 7227420   0 2007-10-06      A R
7   7 7227420   0 2007-10-07      A R
8   8 7227420   0 2007-10-08      A R
9   9 7227420   0 2007-10-09      A R
10 10 7227420   0 2007-10-10      A R
    X   staid val      dates qualcode
1   1 7227448 0.0 1967-10-01      A R
2   2 7227448 0.0 1967-10-02      A R
3   3 7227448 0.0 1967-10-03      A R
4   4 7227448 0.0 1967-10-04      A R
5   5 7227448 0.0 1967-10-05      A R
6   6 7227448 0.0 1967-10-06      A R
7   7 7227448 0.5 1967-10-07        A
8   8 7227448 0.0 1967-10-08      A R
9   9 7227448 0.0 1967-10-09      A R
10 10 7227448 0.0 1967-10-10      A R
    X   staid val      dates qualcode
1   1 7227470 2.5 1968-10-01        A
2   2 7227470 2.5 1968-10-02        A
3   3 7227470 2.5 1968-10-03        A
4   4 7227470 4.0 1968-10-04        A
5   5 7227470 5.0 1968-10-05        A
6   6 7227470 4.0 1968-10-06        A
7   7 7227470 3.5 1968-10-07        A
8   8 7227470 4.0 1968-10-08        A
9   9 7227470 5.0 1968-10-09        A
10 10 7227470 6.0 1968-10-10        A

The problem I am running into is how to do this in a for loop, since there are more than 800 gages. I tried the code below, but I do not seem to get any output.

# Set Working Directory
setwd("D:\\Baylor University\\Texas EFlow Spatial\\Subsplit Yearly\\")

for (i in 1:length(lgages)) {
  data_daily <- read.csv(paste(path_2, file_list[i], sep = ""))
  header <- head(data_daily$dates,1)
   
  # Setup split by year
  date_str <-as.character(data_daily$dates)
  date_substr <-(substr(date_str,1,4))
  
  # Split files by year
  out <- split(data_daily, date_substr )
  for (j in names(out))
    new_file<- (paste0(j,":", lgages[i]))
    path_out="D:\\Baylor University\\Texas EFlow Spatial\\Subsplit Yearly\\"
    fileName=paste(path_out, new_file, dat.type, sep='')
    write.csv(data_daily,fileName)  
  }

Ideally, I would like the output to be named in the format year:gage.csv, for example

1970:07223300.csv
1971:07223300.csv
1972:07223300.csv
1973:07223300.csv
etc...

However, what I get in the target working directory is blank files of type "File", named with the year only

1970
1971
1972
1973
etc...
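
A likely explanation for the year-only names, assuming the script runs on Windows with an NTFS drive (the D: paths suggest this): a colon is not valid inside a Windows file name, and on NTFS a name such as 1970:07223300.csv is treated as a file called 1970 plus an alternate data stream called 07223300.csv. The data goes into the stream, so the visible file 1970 looks empty and has no extension. Below is a minimal sketch of building the name with a different separator. The helper year_file_name and the underscore separator are illustrative choices, not part of the original code.

```r
# Hypothetical helper: build "year_gage.csv" and refuse names containing ":"
year_file_name <- function(year, gage_id, ext = ".csv", sep = "_") {
  name <- paste0(year, sep, gage_id, ext)
  if (grepl(":", name, fixed = TRUE)) {
    stop("file name contains ':', which Windows does not allow: ", name)
  }
  name
}

year_file_name(1970, "07223300")
```

Any separator that is legal in a file name (for example "_" or "-") would work equally well.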

I think you want to load the data from a single gage, subset it by year, save the output, and then move on to the next gage. Is that right? If so, here is my approach:

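# Load data from one gage at a time
# (assumes file_list and lgages are in the same order)
for (i in seq_along(lgages)) {
  data_daily <- read.csv(paste0(path_2, file_list[i]))

  # The sample data has no year column, so derive it from dates (YYYY-MM-DD)
  data_daily$year <- substr(as.character(data_daily$dates), 1, 4)

  # Save the data from each year in that gage, one at a time
  # This will continue for all years in a single gage
  for (j in unique(data_daily$year)) {
    # "_" instead of ":" because Windows does not allow ":" in file names
    new_file <- paste0(j, "_", lgages[i], dat.type)
    path_out <- "D:\\Baylor University\\Texas EFlow Spatial\\Subsplit Yearly\\"
    write.csv(x = data_daily[data_daily$year == j, ], file = paste0(path_out, new_file))
  }
}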
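
The same idea can be tried on a small made-up data set before running it on all 800+ gages. The sketch below is only an illustration: the function split_gage_by_year, the demo data frame, and the temporary output folder are all hypothetical, and it uses split() on the year taken from the dates column, as the question attempted.

```r
# Hypothetical helper: write one CSV per calendar year for a single gage.
split_gage_by_year <- function(data_daily, gage_id, out_dir, ext = ".csv") {
  # Take the year from the first four characters of the YYYY-MM-DD date
  year <- substr(as.character(data_daily$dates), 1, 4)
  by_year <- split(data_daily, year)

  written <- character(0)
  for (yr in names(by_year)) {
    out_file <- file.path(out_dir, paste0(yr, "_", gage_id, ext))
    write.csv(by_year[[yr]], out_file, row.names = FALSE)
    written <- c(written, out_file)
  }
  written
}

# Small made-up data set in the same layout as the gage files
demo <- data.frame(
  staid = "07223300",
  val = c(76, 90, 94, 110),
  dates = c("1970-12-30", "1970-12-31", "1971-01-01", "1971-01-02"),
  qualcode = "A",
  stringsAsFactors = FALSE
)

# Write into a temporary folder so nothing on the real drive is touched
out_dir <- file.path(tempdir(), "subsplit_yearly_demo")
dir.create(out_dir, showWarnings = FALSE)
files <- split_gage_by_year(demo, "07223300", out_dir)
basename(files)
```

To apply this to the real files, the function could be called inside a loop over seq_along(file_list), reading each file with read.csv and taking the gage ID from the file name, for example with tools::file_path_sans_ext(file_list[i]). That would avoid relying on lgages and file_list being in the same order.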
