Python, efficient way to operate on pair of coordinates

Question

I have a data file which has latitude and longitude information which I have stored as a list of tuples of the form

[(lat1, lon1), (lat1, lon1), (lat2, lon2), (lat3, lon3), (lat3, lon3)  ...]

As shown above the consecutive locations (lat, lon) may be the same if the location in the data file has not changed. Hence, the order is very important here. What I am interested in is a fairly efficient way to check when the coordinates change, lat1, lon1 -> lat2, lon2 etc. and then get the distance between these two coordinates.

I already have a function to get the distance of the form getDistance(lat1, lon1, lat2, lon2) which returns the calculated distance between these locations. I want to store these distances in a list from which I can do some plots later on.

Answer 1

You could combine a function that filters out duplicates with one that iterates over pairs:

First lets take care of eliminating duplicate subsequent entries in the list. Since we wish to preserve order, as well as allow duplicates that are not next to each other, we cannot use a simple set. So if we a list of coordinates such as [(0, 0), (4, 4), (4, 4), (1, 1), (0, 0)] the correct output would be [(0, 0), (4, 4), (1, 1), (0, 0)] . A simple function that accomplishes this is:

def filter_duplicates(items):
  """A generator that ignores subsequent entires that are duplicates

  >>> items = [0, 1, 1, 2, 3, 3, 3, 4, 1]
  >>> list(filter_duplicates(items))
  [0, 1, 2, 3, 4, 1]

  """
  prev = None
  for item in items:
    if item != prev:
        yield item 
        prev = item

The yield statement is like a return that doesn't actually return. Each time it is called it passes the value back to the calling function. See What does the "yield" keyword do in Python? for a better explanation.

This simply iterates through each item and compares it to the previous item. If the item is different it yields it back to the calling function and stores it as the current previous item. Another way to write this function would have been:

def filter_duplicates_2(items): result = [] prev = None for item in items: if item != prev: result.append(item) prev = item return result

Though the accomplish the same thing, this way would end up require more memory and would be less efficient because it has to create a new list to store everything.

Now that we have have a way to ensure that every item is different than its neighbors, we need to calculate the distance between subsequent pairs. A simple way to do this is:

def pairs(iterable):
    """A generate over pairs of items in iterable

    >>> list(pairs([0, 8, 2, 1, 3]))
    [(0, 8), (8, 2), (2, 1), (1, 3)]

    """
    iterator = iter(iterable)
    prev = next(iterator)
    for j in iterator:
        yield prev, j
        prev = j

This function is similar to the filter_duplicates function. It simply keeps track of the previous item that it observed, and for each item that it processes it yields that item and the previous item. The only trick it uses is that it assignes prev to the very first item in the list using the next() function call.

If we combine the two functions we end up with:

for (x1, y1), (x2, y2) in pairs(filter_duplicates(coords)):
   distance = getDistance(x1, y1, x2, y2)

Answer 2

Here's a way to do it using just functions from itertools :

from itertools import *

l = [...]
ks = (k for k,g in groupby(l))
t1, t2 = tee(ks)
t2.next() # advance so we get adjacent pairs
for k1, k2 in izip(t1, t2):
    # call getDistance on k1, k2

This groups adjacent equal elements, then uses a pair of tee 'd iterators to pull out adjacent pairs from the group list.

Using just groupby :

l = [...]
gs = itertools.groupby(l)
last, _ = gs.next()
for k, g in gs:
    # call getDistance on (last, k)
    last = k

Python, efficient way to operate on pair of coordinates

Question

2 answers

solution1
5 ACCPTED 2013-04-04 08:04:11

solution2
0 2013-04-04 08:16:23

Python, efficient way to operate on pair of coordinates

Question

2 answers

solution1 5 ACCPTED 2013-04-04 08:04:11

solution2 0 2013-04-04 08:16:23

solution1
5 ACCPTED 2013-04-04 08:04:11

solution2
0 2013-04-04 08:16:23