简体   繁体   中英

Make For Loop and Spacial Computing Faster?

I am playing with a large dataset (~1.5m rows x 21 columns). Which includes a long, lat information of a transaction. I am computing the distance of this transaction from couple of target locations and appending this as new column to main dataset:

TargetLocation1<-data.frame(Long=XX.XXX,Lat=XX.XXX, Name="TargetLocation1", Size=ZZZZ)
TargetLocation2<-data.frame(Long=XX.XXX,Lat=XX.XXX, Name="TargetLocation2", Size=YYYY)

## MainData[6:7] are long and lat columns

MainData$DistanceFromTarget1<-distVincentyEllipsoid(MainData[6:7], TargetLocation1[1:2]) 
MainData$DistanceFromTarget2<-distVincentyEllipsoid(MainData[6:7], TargetLocation2[1:2]) 

I am using geosphere() package's distVincentyEllipsoid function to compute the distances. As you can imaging, distVincentyEllipsoid function is a computing intensive but it is more accurate (compared to other functions of the same package distHaversine(); distMeeus(); distRhumb(); distVincentySphere() )

Q1) It takes me about 5-10 mins to compute distances for each target location [I have 16 GB RAM and i7 6600U 2.81Ghz Intel CPU ], and I have multiple target locations. Is there any faster way to do this?

Q2) Then I am creating a new column for a categorical variable to mark each transaction if it belongs to market definition of target locations. A for loop with 2 if statements. Is there any other way to make this computation faster?

  MainData$TransactionOrigin<-"Other"

  for (x in 1:nrow(MainData)){
  if (MainData$DistanceFromTarget1[x]<=7000)
  MainData$TransactionOrigin[x]="Target1"
  if (MainData$DistanceFromTarget2[x]<=4000)
  MainData$TransactionOrigin[x]="Target2"
}

Thanks

Regarding Q2
This will run much faster if you lose the loop.

    MainData$TransactionOrigin <- "Other"
    MainData$TransactionOrigin[which(MainData$DistanceFromTarget1[x]<=7000)] <- "Target1"
    MainData$TransactionOrigin[which(MainData$DistanceFromTarget2[x]<=4000)] <- "Target2"

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM