R programming: subsetting a data frame multiple times, once for each value in a vector and in a data frame column
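I have a vector with the values 1:6, a data frame of time bins (5-minute intervals in the sample below), and a data frame of scan data. The data frames look like this.
bins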
idMin5Bin BinStart BinEnd
22 22 2015-08-13 10:15:00 2015-08-13 10:19:59
23 23 2015-08-13 10:20:00 2015-08-13 10:24:59
24 24 2015-08-13 10:25:00 2015-08-13 10:29:59
25 25 2015-08-13 10:30:00 2015-08-13 10:34:59
26 26 2015-08-13 10:35:00 2015-08-13 10:39:59
27 27 2015-08-13 10:40:00 2015-08-13 10:44:59
cars
idTrip Link_IDLink StartCluster_id Speed firstScan
10 10 5 19 47.961 2015-08-13 10:11:49
11 11 5 14 118.800 2015-08-13 10:12:33
12 11 5 14 118.800 2015-08-13 10:13:16
13 12 5 22 47.793 2015-08-13 10:11:21
15 14 5 28 56.321 2015-08-13 10:13:09
24 22 5 52 45.692 2015-08-13 10:14:50
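For each value in the vector, I want to look in the cars table and find all cars whose Link_IDLink value matches the vector value.
Then I want to subset those matching cars by bin, comparing each car's firstScan with BinStart and BinEnd from the bins table.
Finally, I want to plot the values in each subset.
The only strategy I can think of is a nested loop (which I know is frowned upon). Even with the nested loop, the sample code below gives me the following error.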
for (i in 1:length(vector)){
  tempcars<-cars[cars[,2]==i,]
  for (k in 1:nrow(bins)){
    tempcars1<-subset(tempcars, firstScan<bins[k,3] & firstScan>bins[k,2])
    hist(tempcars1[,5], breaks =200)
  }
}
Error in hist.default(unclass(x), unclass(breaks), plot = FALSE, warn.unused = FALSE, :
character(0) In addition: Warning messages:
1: In min(x) : no non-missing arguments to min; returning Inf
2: In max(x) : no non-missing arguments to max; returning -Inf
I would certainly like to get away from using loops, but any help with the loop would be appreciated too.
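The error is consistent with hist() being called on an empty subset: both warnings come from min() and max() receiving no values. Note also that column 5 of cars is firstScan rather than Speed, which would explain why the call goes through the date-time method (hist.default(unclass(x), ...)). Below is a minimal sketch of the loop that selects columns by name and skips empty subsets. It assumes Speed is the value meant to be plotted, and the small data set and the names links and plotted are made up for illustration.

```r
# Made-up sample data using the column names from the question
bins <- data.frame(
  idMin5Bin = 22:24,
  BinStart  = as.POSIXct(c("2015-08-13 10:15:00", "2015-08-13 10:20:00",
                           "2015-08-13 10:25:00"), tz = "UTC"),
  BinEnd    = as.POSIXct(c("2015-08-13 10:19:59", "2015-08-13 10:24:59",
                           "2015-08-13 10:29:59"), tz = "UTC")
)
cars <- data.frame(
  Link_IDLink = c(5, 5, 5, 3),
  Speed       = c(47.961, 118.800, 56.321, 45.692),
  firstScan   = as.POSIXct(c("2015-08-13 10:16:10", "2015-08-13 10:17:45",
                             "2015-08-13 10:21:30", "2015-08-13 10:26:05"),
                           tz = "UTC")
)
links <- 1:6

plotted <- character(0)  # records which (link, bin) pairs were drawn
for (link in links) {
  linkCars <- cars[cars$Link_IDLink == link, ]
  for (k in seq_len(nrow(bins))) {
    # Cars whose first scan falls inside bin k (both ends inclusive)
    inBin <- linkCars$firstScan >= bins$BinStart[k] &
             linkCars$firstScan <= bins$BinEnd[k]
    speeds <- linkCars$Speed[inBin]
    if (length(speeds) == 0) next  # nothing to plot for this pair
    hist(speeds, main = paste("Link", link, "bin", bins$idMin5Bin[k]),
         xlab = "Speed")
    plotted <- c(plotted, paste(link, bins$idMin5Bin[k], sep = "-"))
  }
}
```

In the sample rows shown in the question, every firstScan is earlier than the first BinStart, so every subset would be empty there as well.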
Here is the start of an answer... I hope it helps...
# Generate the data (20 consecutive 15-second bins, 20 cars per link)
theVec <- 1:6
someTimes <- seq(as.POSIXlt(Sys.time()), by = "sec", length.out = 300)
bins <- data.frame(idMin5Bin = 1:20,
                   BinStart = someTimes[1 + 15 * (0:19)],
                   BinEnd = someTimes[15 * (1:20)])
cars <- data.frame(Link_IDLink = rep(theVec, each = 20),
                   firstScan = sample(someTimes, 120, replace = TRUE),
                   Speed = runif(120, 30, 100))
# First split by Link_IDLink
subCars <- subset(cars, Link_IDLink %in% theVec)
carList <- split(subCars, subCars$Link_IDLink)
# Now "cut" the times for each element of the list
outList <- lapply(carList, function(df, binData) {
  # Breaks are every BinStart plus the final BinEnd
  theBins <- c(binData$BinStart, binData$BinEnd[nrow(binData)])
  df$idMin5Bin <- cut(df$firstScan, theBins, labels = binData$idMin5Bin)
  df
}, binData = bins)
You end up with this...
> head(outList[[1]])
Link_IDLink firstScan Speed idMin5Bin
1 1 2015-09-10 22:42:33 33.85446 17
2 1 2015-09-10 22:41:06 81.43807 11
3 1 2015-09-10 22:40:53 90.59927 10
4 1 2015-09-10 22:39:38 56.89429 5
5 1 2015-09-10 22:40:20 70.44760 8
6 1 2015-09-10 22:42:08 88.93505 15
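If the bins are sorted by BinStart and do not overlap, the bin lookup can also be done in one vectorised call without splitting first. The sketch below swaps cut() for findInterval(), which is a different technique from the one above; the sample rows are made up, and scans that fall outside every bin are set to NA.

```r
# Made-up sample data using the column names from the question
bins <- data.frame(
  idMin5Bin = 22:24,
  BinStart  = as.POSIXct(c("2015-08-13 10:15:00", "2015-08-13 10:20:00",
                           "2015-08-13 10:25:00"), tz = "UTC"),
  BinEnd    = as.POSIXct(c("2015-08-13 10:19:59", "2015-08-13 10:24:59",
                           "2015-08-13 10:29:59"), tz = "UTC")
)
cars <- data.frame(
  Speed     = c(47.961, 118.800, 56.321, 45.692, 60.000),
  firstScan = as.POSIXct(c("2015-08-13 10:11:49", "2015-08-13 10:16:10",
                           "2015-08-13 10:21:30", "2015-08-13 10:29:59",
                           "2015-08-13 10:31:00"), tz = "UTC")
)

# Index of the last bin whose BinStart is <= firstScan (0 if before all bins)
idx <- findInterval(as.numeric(cars$firstScan), as.numeric(bins$BinStart))
idx[idx == 0] <- NA
# A scan later than the matched bin's BinEnd belongs to no bin
idx[!is.na(idx) & cars$firstScan > bins$BinEnd[idx]] <- NA

cars$idMin5Bin <- bins$idMin5Bin[idx]
```

The result can then be grouped by Link_IDLink and idMin5Bin in whatever way suits the plot.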
You can plot this in a number of ways. Let me know if you need help with that.
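As one possible way to plot it, the sketch below draws one panel per link with a boxplot of Speed for each time bin, then computes the mean Speed per link and bin. It regenerates example data in the same way as the code above; the fixed seed, the fixed start time, and the names allCars and meanSpeed are assumptions made so the example is reproducible.

```r
# Regenerate example data (seed and start time are assumptions)
set.seed(42)
theVec <- 1:6
someTimes <- seq(as.POSIXct("2015-08-13 10:15:00", tz = "UTC"),
                 by = "sec", length.out = 300)
bins <- data.frame(idMin5Bin = 1:20,
                   BinStart = someTimes[1 + 15 * (0:19)],
                   BinEnd = someTimes[15 * (1:20)])
cars <- data.frame(Link_IDLink = rep(theVec, each = 20),
                   firstScan = sample(someTimes, 120, replace = TRUE),
                   Speed = runif(120, 30, 100))

# Split by link and label each scan with its bin, as in the answer
carList <- split(cars, cars$Link_IDLink)
outList <- lapply(carList, function(df, binData) {
  theBins <- c(binData$BinStart, binData$BinEnd[nrow(binData)])
  df$idMin5Bin <- cut(df$firstScan, theBins, labels = binData$idMin5Bin)
  df
}, binData = bins)

# One panel per link: distribution of Speed within each time bin
op <- par(mfrow = c(2, 3))
for (link in names(outList)) {
  boxplot(Speed ~ idMin5Bin, data = outList[[link]],
          main = paste("Link", link), xlab = "idMin5Bin", ylab = "Speed")
}
par(op)

# Numeric summary of the same grouping: mean Speed per link and bin
allCars <- do.call(rbind, outList)
meanSpeed <- aggregate(Speed ~ Link_IDLink + idMin5Bin, data = allCars,
                       FUN = mean)
```

With only 20 cars per link spread over 20 bins, most boxes hold one or two points; with real scan data a histogram per link and bin may be more informative.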