I am trying to create a new column conditional on another column, a bit like a moving average or moving window but based on distance between points. Take for example row 2 with a CO2 of 399.935. I would like to have the mean of all the points within 100 m (traveled) of that point. In my example (looking at column CumDist), rows 1, 3, 4, 5 would be selected to calculate the mean. The column CumDist (*100,000 to have the units in meters) consists of cumulative distance traveled. I have 5000 points and obviously the width (or the number of rows) of the moving window will vary.
I tested over() from the sp package, but it's problematic if the same road is taken more than once. I looked on the web for other solutions and did not find anything that could help me.
dput(DF)
structure(list(CO2 = c(399.9350305, 399.9350305, 399.9350305,
400.0320031, 400.0320031, 400.0320031, 399.7718229, 399.7718229,
399.7718229, 399.3855075, 399.3855075, 399.3855075, 399.4708139,
399.4708139, 399.4708139, 400.0362474, 400.0362474, 400.0362474,
399.7556753, 399.7556753), lon = c(-103.7093538, -103.709352,
-103.7093492, -103.7093467, -103.7093455, -103.7093465, -103.7093482,
-103.7093596, -103.7094074, -103.7094625, -103.7094966, -103.709593,
-103.709649, -103.7096717, -103.7097349, -103.7097795, -103.709827,
-103.7099007, -103.709924, -103.7099887), lat = c(49.46972027,
49.46972153, 49.46971675, 49.46971533, 49.46971307, 49.4697124,
49.46970636, 49.46968214, 49.46960921, 49.46955984, 49.46953621,
49.46945809, 49.46938994, 49.46935281, 49.46924309, 49.46918635,
49.46914762, 49.46912566, 49.46912407, 49.46913321),distDiff = c(0.000342016147509882,
0.000191466419697602, 0.000569046320857002, 0.000240367540492089,
0.000265977754839834, 0.000103953049523505, 0.000682968856240796,
0.0028176007969857, 0.00882013898948418, 0.00678966015562509,
0.00360774024245839, 0.011149423290729, 0.00859796340323456,
0.00444526066124642, 0.0130344010874029, 0.00709037369666853,
0.00551435348701512, 0.00587377717110946, 0.00169806309901329,
0.00479849401022625), CumDist = c(0.000342016147509882, 0.000533482567207484,
0.00110252888806449, 0.00134289642855657, 0.00160887418339641,
0.00171282723291991, 0.00239579608916071, 0.00521339688614641,
0.0140335358756306, 0.0208231960312557, 0.0244309362737141, 0.0355803595644431,
0.0441783229676777, 0.0486235836289241, 0.0616579847163269, 0.0687483584129955,
0.0742627119000106, 0.08013648907112, 0.0818345521701333, 0.0866330461803596
)), .Names = c("X12CO2_dry", "coords.x1", "coords.x2", "V1",
"CumDist"), row.names = 2:21, class = "data.frame")
thanks, Martin
The window that belongs to the i-th row starts at n[i] and ends at m[i]-1. Hence the sum of the CO2-values in the i-th window is CumCO2[m[i]]-CumCO2[n[i]]. (Notice that the indices in CumCO2 are shifted by 1, because of the leading 0.) Dividing this CO2-sum by the window size m[i]-n[i] gives the values meanCO2 for the new column:
n <- sapply( df$CumDist,
function(x){
which.max( df$CumDist >= x-0.001 )
}
)
m <- sapply( df$CumDist,
function(x){
which.max( c(df$CumDist,Inf) > x+0.001 )
}
)
CumCO2 <- c( 0, cumsum(df$X12CO2) )
meanCO2 <- ( CumCO2[m] - CumCO2[n] ) / (m-n)
> n
[1] 1 1 1 2 3 3 5 8 9 10 11 12 13 14 15 16 17 18 19 20
> m
[1] 4 5 7 7 8 8 8 9 10 11 12 13 14 15 16 17 18 19 20 21
> meanCO2
[1] 399.9350 399.9593 399.9835 399.9932 399.9606 399.9606 399.9453 399.7718 399.7718 399.3855 399.3855 399.3855 399.4708 399.4708 399.4708 400.0362
[17] 400.0362 400.0362 399.7557 399.7557
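As a rough sketch of the same cumulative-sum idea without the two sapply() calls: base R's findInterval() counts how many values of a sorted vector lie at or below (or, with left.open = TRUE, strictly below) each query, which is what n and m need. This assumes CumDist is non-decreasing and a reasonably recent R that supports left.open. The toy data frame and the names half_width, cum_co2 and mean_co2 are made up for illustration and are not the asker's data.

```r
# Toy data (hypothetical values, not the data set from the question)
toy <- data.frame(
  CumDist = c(0.0002, 0.0005, 0.0011, 0.0014, 0.0030),
  CO2     = c(400, 402, 404, 406, 410)
)
half_width <- 0.001   # 100 m in the CumDist units used in the question

d <- toy$CumDist      # assumed sorted in non-decreasing order

# n[i]: first index with d >= d[i] - half_width
# (count of values strictly below the lower bound, plus one)
n <- findInterval(d - half_width, d, left.open = TRUE) + 1

# m[i]: first index with d > d[i] + half_width, or length(d) + 1 if none
# (count of values at or below the upper bound, plus one)
m <- findInterval(d + half_width, d) + 1

# cumulative sum with a leading 0, so cum_co2[k + 1] is the sum of rows 1..k
cum_co2  <- c(0, cumsum(toy$CO2))
mean_co2 <- (cum_co2[m] - cum_co2[n]) / (m - n)

print(mean_co2)
```

With 5000 points the difference in speed is unlikely to matter much; the main point is that both bounds come from one vectorised call each.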
Man, you beat me to it with a cleaner solution, mra68.
Here's mine using a few loops.
####################
for (j in seq_len(nrow(DF))) {    # loop through all rows of your dataset
  CO2list <- NULL                 # must exist before values are appended in the inner loop
  for (i in seq_len(nrow(DF))) {  # loop through all distances in the table
    # CumDist[j] is the point with the 100 meter window around it;
    # keep row i if its CumDist differs from it by <= 100/100000
    if (abs(DF$CumDist[i] - DF$CumDist[j]) <= 0.001) {
      # store the CO2 entries that are within the 100 meter window in a vector
      CO2list <- c(CO2list, DF$X12CO2_dry[i])
    }
  }
  # get the mean of the list and store it in the column named CO2AVG
  DF$CO2AVG[j] <- mean(CO2list)
}
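The same brute-force comparison can probably be written more compactly with vapply(), shown here as a minimal sketch rather than a replacement for the loops. The helper name window_mean and the toy vectors are hypothetical and only serve the illustration.

```r
# Hypothetical helper: for each point, average all values whose distance
# coordinate lies within half_width of that point (same rule as the loops)
window_mean <- function(dist, value, half_width) {
  vapply(
    seq_along(dist),
    function(j) mean(value[abs(dist - dist[j]) <= half_width]),
    numeric(1)
  )
}

# Toy input (made-up values, not the data set from the question)
dist  <- c(0.0002, 0.0005, 0.0011, 0.0014, 0.0030)
value <- c(400, 402, 404, 406, 410)

co2_avg <- window_mean(dist, value, half_width = 0.001)
print(co2_avg)
```

On the real data this would be called as window_mean(DF$CumDist, DF$X12CO2_dry, 0.001). It still compares every pair of rows, so the work grows with the square of the number of rows.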