A subset of my data for 2 individuals (squirrelID) can be found here.
My data looks as follows (only showing the relevant columns):
lat lon NatalMidden squirrelID type
60.9577819984406 -138.0347849708050 -27 NA Nest2017
60.9574120212346 -138.0345689691600 -27 NA NatalMidden
60.9578209742904 -138.0346520338210 -27 23054 Foray
60.9575380012393 -138.0348329991100 -27 23054 Foray
60.9576250053942 -138.0339069664480 -27 23054 Foray
60.957643026486 -138.0338829942050 -27 23054 Foray
60.9575670026243 -138.0348739866170 -27 23054 Foray
For example, squirrelID 23054 was located multiple times (Foray in the type column), and I have a corresponding latitude (lat) and longitude (lon) for each Foray. I am trying to calculate the distance between each Foray and the Nest2017 (both values of the type column) for each individual (squirrelID) separately.
The below code works (and gives me a value of 15.11501 m), but it requires that I manually enter each point. This is not problematic per se, but I am working with 2000+ observations, with more than 2 options in each of the grid, NatalMidden, and squirrelID columns.
library(Imap)
gdist(60.9578209742904,-138.0346520338210, 60.9577819984406, -138.0347849708050, units="m", verbose=FALSE)
Is there a way I could work within the dplyr framework to group_by(squirrelID) and then calculate the distance between each Foray and its corresponding Nest2017 (the Foray and its Nest2017 share the same NatalMidden)?
My ultimate goal is to create a new column holding the distance between the Foray and Nest2017 for each squirrelID.
UPDATE:
I have tried the following:
nests <- df %>% # creating a new data frame for Nest2017 points only
  filter(type %in% "Nest2017") %>%
  select(ID, lat, lon, ele, grid, NatalMidden, type)
foray <- df %>% # creating a new data frame for Foray points only
  filter(type %in% "Foray") %>%
  mutate(sq_id = as.factor(sq_id)) %>%
  group_by(sq_id)
But these subsets do not work in the gdist function (I get this error):
gdist(nests$lat, nests$lon, foray$lat, foray$lon, units="m", verbose=FALSE)
Error in while (abs(lamda - lamda.old) > 1e-11) { :
missing value where TRUE/FALSE needed
In addition: Warning messages:
1: In Ops.factor(lon.1, rad) : ‘*’ not meaningful for factors
2: In Ops.factor(lat.1, rad) : ‘*’ not meaningful for factors
3: In Ops.factor(lon.2, rad) : ‘*’ not meaningful for factors
4: In Ops.factor(lat.2, rad) : ‘*’ not meaningful for factors
5: In lon.1 - lon.2 :
longer object length is not a multiple of shorter object length
6: In while (abs(lamda - lamda.old) > 1e-11) { :
the condition has length > 1 and only the first element will be used
I am not very familiar with the dplyr package, but I think this does what you are interested in:
library(dplyr)
library(Imap)
# read data from the FigShare linked file
squirrel_data <- read.table("figshare.txt", header = TRUE)
# split into 'Forays' and 'Nests'
nests <- squirrel_data %>%
  filter(type %in% "Nest2017")
foray <- squirrel_data %>%
  filter(type %in% "Foray")
# merge 'Forays' and 'Nests' by 'NatalMidden'
nests_foray <- inner_join(
  nests, foray, by = "NatalMidden", suffix = c(".nest", ".foray"))
# calculate the distance for each row, keep 'squirrelID' and 'dist'
# gdist() takes longitude first (lon.1, lat.1, lon.2, lat.2), as the
# lon.1/lat.1 names in the warnings above show, so name the arguments
results <- nests_foray %>%
  rowwise() %>%
  mutate(dist = gdist(lon.1 = lon.nest, lat.1 = lat.nest,
                      lon.2 = lon.foray, lat.2 = lat.foray, units = "m")) %>%
  select(squirrelID.foray, dist)
head(results, n = 3)
# Output as originally posted (computed with latitude passed first,
# so the values from the longitude-first call above will differ):
## A tibble: 3 x 2
# squirrelID.foray dist
# <int> <dbl>
#1 22684 14.03843
#2 22684 59.06996
#3 22684 13.40567
This is basically what I proposed in my first comment, but using dplyr functions instead of base R. The idea is simply to create an inner join between the "Foray" rows and the "Nest2017" rows by "NatalMidden", then calculate the distance for each row and report it with "squirrelID". I hope this helps.
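The same join-then-measure idea can also be written without rowwise(). What follows is a minimal sketch rather than a drop-in replacement: it uses base R's merge() and a hypothetical helper, haversine_m(), which is not part of Imap and assumes a spherical Earth of mean radius 6371008.8 m, so its results will differ slightly from gdist(), which works on an ellipsoid. The sample rows are copied from the question.

```r
# Sample rows copied from the question
squirrels <- data.frame(
  lat = c(60.9577819984406, 60.9574120212346, 60.9578209742904,
          60.9575380012393, 60.9576250053942),
  lon = c(-138.0347849708050, -138.0345689691600, -138.0346520338210,
          -138.0348329991100, -138.0339069664480),
  NatalMidden = -27,
  squirrelID = c(NA, NA, 23054, 23054, 23054),
  type = c("Nest2017", "NatalMidden", "Foray", "Foray", "Foray"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not from Imap): haversine great-circle distance in
# metres, assuming a sphere; vectorised over all four coordinate arguments
haversine_m <- function(lat1, lon1, lat2, lon2, radius = 6371008.8) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius * asin(pmin(1, sqrt(a)))
}

# Split into nest rows and foray rows
nests <- squirrels[squirrels$type == "Nest2017",
                   c("NatalMidden", "lat", "lon")]
foray <- squirrels[squirrels$type == "Foray",
                   c("NatalMidden", "squirrelID", "lat", "lon")]

# Inner join on NatalMidden, then one vectorised distance call
pairs <- merge(nests, foray, by = "NatalMidden",
               suffixes = c(".nest", ".foray"))
pairs$dist_m <- haversine_m(pairs$lat.nest, pairs$lon.nest,
                            pairs$lat.foray, pairs$lon.foray)
print(pairs[, c("squirrelID", "dist_m")])
```

Because haversine_m() is vectorised, each row of pairs gets its distance in a single call, and the squirrelID column identifies which individual the foray belongs to.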