I want to subset a dataframe using a loop and save as csv files named as the group value
Subset CSV files in for loop by date and save file with different name
I have a list of hydrologic gages, stored as a list named lgages.
gages<-read.delim("D:\\Baylor University\\Texas EFlow Spatial\\TX Gages List only txt.txt", colClasses = c("Gage_ID" = "character"))
# Set Working Directory
setwd("D:\\Baylor University\\Texas EFlow Spatial")
#using txt and defining as Character in order to not lose leading zeros in gage id
lgages<-as.list(gages$Gage_ID)
Code <- "00060" #daily flow
Stat <- "00003" #mean
start.date <- "1900-01-01"
end.date <- "2020-12-31" #statistical package requires full years of data, either calendar or water
dat.type <- ".csv"
path_2 <- "D:\\Baylor University\\Texas EFlow Spatial\\Yearly_Stats_Red\\"
file_list <- list.files(path_2)
An example of the list I am working with is this:
dput(file_list)
c("07223300.csv",
"07227000.csv",
"07227200.csv",
"07227420.csv",
"07227448.csv",
"07227470.csv"
)
I am trying to split the data in the csv files by year.
X staid val dates qualcode
1 1 7223300 76 1970-10-01 A
2 2 7223300 90 1970-10-02 A
3 3 7223300 94 1970-10-03 A
4 4 7223300 110 1970-10-04 A
5 5 7223300 124 1970-10-05 A
6 6 7223300 140 1970-10-06 A
7 7 7223300 156 1970-10-07 A
8 8 7223300 135 1970-10-08 A
9 9 7223300 117 1970-10-09 A
10 10 7223300 112 1970-10-10 A
X staid val dates qualcode
1 1 7227000 20 1909-01-01 A
2 2 7227000 20 1909-01-02 A
3 3 7227000 20 1909-01-03 A
4 4 7227000 20 1909-01-04 A
5 5 7227000 20 1909-01-05 A
6 6 7227000 20 1909-01-06 A
7 7 7227000 20 1909-01-07 A
8 8 7227000 20 1909-01-08 A
9 9 7227000 20 1909-01-09 A
10 10 7227000 20 1909-01-10 A
X staid val dates qualcode
1 1 7227200 0 1966-06-17 A R
2 2 7227200 0 1966-06-18 A R
3 3 7227200 0 1966-06-19 A R
4 4 7227200 0 1966-06-20 A R
5 5 7227200 0 1966-06-21 A R
6 6 7227200 0 1966-06-22 A R
7 7 7227200 0 1966-06-23 A R
8 8 7227200 0 1966-06-24 A R
9 9 7227200 71 1966-06-25 A
10 10 7227200 130 1966-06-26 A
X staid val dates qualcode
1 1 7227420 0 2007-10-01 A R
2 2 7227420 0 2007-10-02 A R
3 3 7227420 0 2007-10-03 A R
4 4 7227420 0 2007-10-04 A R
5 5 7227420 0 2007-10-05 A R
6 6 7227420 0 2007-10-06 A R
7 7 7227420 0 2007-10-07 A R
8 8 7227420 0 2007-10-08 A R
9 9 7227420 0 2007-10-09 A R
10 10 7227420 0 2007-10-10 A R
X staid val dates qualcode
1 1 7227448 0.0 1967-10-01 A R
2 2 7227448 0.0 1967-10-02 A R
3 3 7227448 0.0 1967-10-03 A R
4 4 7227448 0.0 1967-10-04 A R
5 5 7227448 0.0 1967-10-05 A R
6 6 7227448 0.0 1967-10-06 A R
7 7 7227448 0.5 1967-10-07 A
8 8 7227448 0.0 1967-10-08 A R
9 9 7227448 0.0 1967-10-09 A R
10 10 7227448 0.0 1967-10-10 A R
X staid val dates qualcode
1 1 7227470 2.5 1968-10-01 A
2 2 7227470 2.5 1968-10-02 A
3 3 7227470 2.5 1968-10-03 A
4 4 7227470 4.0 1968-10-04 A
5 5 7227470 5.0 1968-10-05 A
6 6 7227470 4.0 1968-10-06 A
7 7 7227470 3.5 1968-10-07 A
8 8 7227470 4.0 1968-10-08 A
9 9 7227470 5.0 1968-10-09 A
10 10 7227470 6.0 1968-10-10 A
The problem I am having is how to do this in a for loop, since there are over 800 gages. I tried the code below, but I don't seem to get any output.
# Set Working Directory
setwd("D:\\Baylor University\\Texas EFlow Spatial\\Subsplit Yearly\\")
for (i in 1:length(lgages)) {
data_daily <- read.csv(paste(path_2, file_list[i], sep = ""))
header <- head(data_daily$dates,1)
# Setup split by year
date_str <-as.character(data_daily$dates)
date_substr <-(substr(date_str,1,4))
# Split files by year
out <- split(data_daily, date_substr )
for (j in names(out))
new_file<- (paste0(j,":", lgages[i]))
path_out="D:\\Baylor University\\Texas EFlow Spatial\\Subsplit Yearly\\"
fileName=paste(path_out, new_file, dat.type, sep='')
write.csv(data_daily,fileName)
}
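A side note on the loop as posted: the inner for has no braces, and in R a for loop without braces repeats only the single statement that follows it. The small sketch below uses made-up year values and a hypothetical names_seen vector to show the effect. It is an illustration of the language rule, not a confirmed diagnosis of the run described here.

```r
# Without braces, only the first statement after for() belongs to the loop.
names_seen <- character(0)
for (j in c("1970", "1971"))
  new_file <- paste0(j, "_gage")          # repeated for every j
names_seen <- c(names_seen, new_file)     # runs once, after the loop ends

# With braces, both statements run on every iteration.
names_braced <- character(0)
for (j in c("1970", "1971")) {
  new_file <- paste0(j, "_gage")
  names_braced <- c(names_braced, new_file)
}

print(names_seen)
print(names_braced)
```

In the posted code this would mean write.csv runs once per gage rather than once per year, and it writes the whole data_daily rather than the yearly piece out[[j]].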
Ideally, I would like the output files to be named in the format year:gage.csv, for example:
1970:07223300.csv
1971:07223300.csv
1972:07223300.csv
1973:07223300.csv
etc...
However, what I get in the desired working directory are blank files of type "File", named with the year only:
1970
1971
1972
1973
etc...
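I think you just want to load the data from a single gage, subset it by each year in that gage, and save the output, then move on to the next gage. Is that right? If so, here is my approach: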
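# Load data from one gage at a time
for (i in seq_along(lgages)) {
  data_daily <- read.csv(paste0(path_2, file_list[i]))
  # The files shown have no year column, so derive it from dates (YYYY-MM-DD)
  data_daily$year <- substr(as.character(data_daily$dates), 1, 4)
  path_out <- "D:\\Baylor University\\Texas EFlow Spatial\\Subsplit Yearly\\"
  # Save data from each year in that gage, one at a time
  # This will continue for all years in a single gage
  for (j in unique(data_daily$year)) {
    # "_" instead of ":" because ":" is not allowed in Windows file names;
    # dat.type (".csv") adds the extension
    new_file <- paste0(j, "_", lgages[i], dat.type)
    write.csv(x = data_daily[data_daily$year == j, ], file = paste0(path_out, new_file))
  }
}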
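The same idea can be sketched with split(), which the question already uses, wrapped in a small function so it can be tried on toy data first. The function name split_gage_by_year, the demo data frame, and the temporary output folder are all made up for illustration. The sketch also assumes the dates column holds YYYY-MM-DD strings, as in the samples above. It names files year_gage.csv instead of year:gage.csv, because Windows does not accept ":" in a file name, and that is a plausible reason for the blank files named only by year.

```r
# Hypothetical helper: write one CSV per calendar year for a single gage.
split_gage_by_year <- function(data_daily, gage_id, out_dir) {
  # The year is the first four characters of a "YYYY-MM-DD" date string.
  years <- substr(as.character(data_daily$dates), 1, 4)
  by_year <- split(data_daily, years)
  paths <- character(0)
  for (yr in names(by_year)) {
    # "_" rather than ":" so the name is valid on Windows.
    out_file <- file.path(out_dir, paste0(yr, "_", gage_id, ".csv"))
    # Write only this year's rows, not the whole data frame.
    write.csv(by_year[[yr]], out_file, row.names = FALSE)
    paths <- c(paths, out_file)
  }
  paths
}

# Made-up rows in the same layout as the gage files shown in the question.
demo <- data.frame(
  staid = "07223300",
  val = c(76, 90, 94),
  dates = c("1970-12-30", "1970-12-31", "1971-01-01"),
  qualcode = "A",
  stringsAsFactors = FALSE
)
demo_dir <- file.path(tempdir(), "subsplit_demo")
dir.create(demo_dir, showWarnings = FALSE)
written <- split_gage_by_year(demo, "07223300", demo_dir)
print(basename(written))
```

For the real files, the function could be called once per entry of file_list. Taking the gage ID from the file name itself, for example with tools::file_path_sans_ext(file_list[i]), avoids relying on lgages and file_list being in the same order.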