简体   繁体   中英

Making code that is both iterative and recursive purely iterative

I have huge lists (>1,000,000 in most cases) that are partitioned into nxn grids. These lists are of some coordinates. I define a neighbouring relation between points - if they are within range of each other, they are 'neighbours' and put into a 'cluster'. We can add cells to these clusters, so cell A is a neighbour of cell B, B a neighbour of C, C is not a neighbour of A. A, B, and C would be in the same cluster.

Given that background, I have the following code that tries to assign points to clusters in Python 3.6:

for i in range(n):
    for j in range(n):
        tile = image[i][j]
        while tile:
            cell = tile.pop()
            cluster = create_and_return_cluster(cell, image_clusters[i][j], (i, j))
            clusterise(cell, cluster, (i, j), image, n, windows, image_clusters)

def clusterise(cell, cluster, tile_index, image, n, windows, image_clusters):
    neighbouring_windows, neighbouring_indices = get_neighbouring_windows(tile_index[0], tile_index[1], n, windows)
    neighbours = get_and_remove_all_neighbours(cell, image, tile_index, neighbouring_windows, neighbouring_indices)
    if neighbours:
        for neighbour, (n_i, n_j) in neighbours:
            add_to_current_cluster(cluster, image_clusters, neighbour, (n_j, n_j))
            clusterise(neighbour, cluster, (n_i, n_j), image, n, windows, image_clusters)

Because of the massive size of the lists, I've had issues with RecursionError and have been scouring the internet for suggestions on tail-recursion. The problem is that this algorithm needs to branch from neighbours to pick up neighbours of those neighbours, and so on. As you can imagine, this gets pretty big pretty quickly in terms of stack frames.

My question is: is it possible to make this algorithm use tail-recursion, or how would one go about making it tail-recursive? I know the cluster argument is essentially an accumulator in this case, but given how the lists shrink and the nasty for-loop in clusterise() I am not sure how to successfully convert to tail-recursion. Does anyone have any ideas? Ideally supported with an explanation.

NB: I am well aware that Python does not optimise tail-recursion by default, yes I am aware that other languages optimise tail-recursion. My question is whether it can be done in this case using Python. I don't want to change if I don't absolutely have to and much of my other code is already in Python.

Just use a queue or stack to track which neighbours to process next; the following function does exactly the same work as your recursive function, iteratively:

from collections import deque

def clusterise(cell, cluster, tile_index, image, n, windows, image_clusters):
    to_process = deque([(cell, tile_index)])
    while to_process:
        cell, tile_index = to_process.pop()
        neighbouring_windows, neighbouring_indices = get_neighbouring_windows(tile_index[0], tile_index[1], n, windows)
        neighbours = get_and_remove_all_neighbours(cell, image, tile_index, neighbouring_windows, neighbouring_indices)
        if neighbours:
            for neighbour, (n_i, n_j) in neighbours:
                add_to_current_cluster(cluster, image_clusters, neighbour, (n_j, n_j))
                to_process.append((neighbour, (n_i, n_j))

So instead of using the Python stack to track what still needs to be processed, we move the varying arguments ( cell and tile_index ) to a deque stack managed by the function instead, which isn't bound like the Python stack is. You can also use it as a queue (pop from the beginning instead of the end with to_process.popleft() ) for a breadth-first processing order. Note that your recursive solution process cells depth-first.

As a side note: yes, you can use a regular Python list as a stack too, but due to the nature of how a list object is grown and shrunk dynamically, for a stack the deque linked-list implementation is more efficient. And it's easier to switch between a stack and a queue this way. See Raymond Hettinger's remarks on deque performance .

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM