What's the difference between these two implementations?

Question

Assume I have a list of lists

S = [list1, list2, ...]

and I want to write a function find such that, for an input x, it checks whether x is in some sublist of S and returns that sublist, or returns None if x is not found.

(Note: the intersection of any two of the sublists is empty, so at most one list will be found.)

My code is very straightforward:

def find(x):
    for L in S:
        if x in L:
            return L
    return None

But I have seen someone write it like this:

def find(x):
    try:
        return next( L for L in S if x in L)
    except StopIteration:
        return None

I wonder what the differences between the two snippets are. Is the second one preferred over the first (for example, from a software project viewpoint)?

Answer 1

The difference is that the second version constructs a generator that yields each sublist of S that contains x.

Then it tries to return the first object that's yielded from that generator by calling next on it.

Conceptually, there's really not much difference between the two snippets. Note how they both employ for L in S -> if x in L: the first one as a traditional for loop with an if statement in its body, the second one in the form of a generator expression. Both versions are lazy, that is, they return as soon as a match is found.
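To make the laziness concrete, here is a minimal sketch that counts how many membership tests each version performs. CountingList is a hypothetical helper introduced only for this illustration, and both functions take S as a parameter rather than reading a global.

```python
class CountingList(list):
    """Hypothetical helper: a list that counts `in` checks made against it."""
    checks = 0  # class-level counter shared by all instances

    def __contains__(self, item):
        CountingList.checks += 1
        return super().__contains__(item)


def find_loop(x, S):
    # Explicit loop: returns as soon as a sublist contains x.
    for L in S:
        if x in L:
            return L
    return None


def find_gen(x, S):
    # Generator expression: next() pulls only the first match.
    return next((L for L in S if x in L), None)


S = [CountingList([1, 2]), CountingList([3, 4]), CountingList([5, 6])]

CountingList.checks = 0
loop_result = find_loop(3, S)
loop_checks = CountingList.checks  # membership tests run by the loop

CountingList.checks = 0
gen_result = find_gen(3, S)
gen_checks = CountingList.checks   # membership tests run by the generator

print(loop_result, loop_checks)
print(gen_result, gen_checks)
```

In this sketch neither version ever tests the third sublist, because the match is in the second one.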

I think your code is perfectly fine. The second one could pass a default value to next to avoid the manual exception handling, i.e.

return next((L for L in S if x in L), None)

which returns the first item yielded by the generator, or None if there's no such item. Is it worth constructing a generator that is supposed to yield a single item here, and is it more readable? I'd say probably not.

Answer 2

Your code is fine, but it could be written more concisely using a list comprehension. The second solution creates a generator using a generator expression. Since the sublists are pairwise disjoint, the generator will yield at most one element.

Using a generator here introduces some overhead, though; a list comprehension can be faster if you only compare a few lists. Keep in mind that the list comprehension always scans all of S, so this applies to small inputs.

def find_list(x, S):
    ret = [L for L in S if x in L]
    return ret[0] if ret else None

def find_iter(x, S):
    ret = (L for L in S if x in L)
    try:
        return next(ret)
    except StopIteration:
        return None
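One behavioural difference between these two functions, separate from raw speed, is how much of S each one visits. The sketch below records the sublists each function pulls from its input; traced is a hypothetical helper added only for this illustration.

```python
def traced(S, seen):
    """Hypothetical helper: yield the sublists of S, recording each one visited."""
    for L in S:
        seen.append(L)
        yield L


def find_list(x, S):
    # List comprehension: builds the full list of matches first.
    ret = [L for L in S if x in L]
    return ret[0] if ret else None


def find_iter(x, S):
    # Generator expression: evaluated only until the first match.
    ret = (L for L in S if x in L)
    try:
        return next(ret)
    except StopIteration:
        return None


S = [["a"], ["b", "c"], ["d"]]

seen_list = []
result_list = find_list("a", traced(S, seen_list))

seen_iter = []
result_iter = find_iter("a", traced(S, seen_iter))

print(len(seen_list), "sublists visited by find_list")
print(len(seen_iter), "sublists visited by find_iter")
```

Both calls return the same sublist here, but the list comprehension walks through every sublist while the generator version stops at the first hit. For a short S this makes little difference; for a long S with an early match it may matter more than the generator's setup cost.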

Runtime test in an interactive IPython shell:

In [1]: S = [["a"], ["b", "c",], ["d"]]

In [2]: %timeit find_list("b", S)
1000000 loops, best of 3: 475 ns per loop

In [3]: %timeit find_list("f", S)
1000000 loops, best of 3: 349 ns per loop

In [4]: %timeit find_iter("b", S)
1000000 loops, best of 3: 802 ns per loop

In [5]: %timeit find_iter("f", S)
100000 loops, best of 3: 1.58 µs per loop

Edit

Using the optimized generator version by @timgeb, the generator expression comes closer, especially when x is not found:

def find_iter_opt(x, S):
    ret = (L for L in S if x in L)
    return next(ret, None)

In [8]: %timeit find_iter_opt("b", S)
1000000 loops, best of 3: 751 ns per loop

In [9]: %timeit find_iter_opt("f", S)
1000000 loops, best of 3: 597 ns per loop
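The timings above use only three short sublists. To repeat the comparison outside IPython, or with a larger S, something like the following sketch with the standard timeit module could be used. The data size (1000 disjoint sublists of 10 integers) and the repeat count are arbitrary assumptions, and no particular timings are claimed here.

```python
import timeit


def find_list(x, S):
    ret = [L for L in S if x in L]
    return ret[0] if ret else None


def find_iter_opt(x, S):
    return next((L for L in S if x in L), None)


# Assumed test data: 1000 pairwise disjoint sublists of 10 integers each.
S = [list(range(i, i + 10)) for i in range(0, 10_000, 10)]

REPEATS = 200  # arbitrary; raise it for more stable numbers

for name, func in [("find_list", find_list), ("find_iter_opt", find_iter_opt)]:
    # Early match, late match, and no match at all.
    for x in (5, 9_995, -1):
        seconds = timeit.timeit(lambda: func(x, S), number=REPEATS)
        print(f"{name}({x}): {seconds / REPEATS * 1e6:.1f} microseconds per call")
```

Trying an early match, a late match, and a miss separately is useful because the two approaches do different amounts of work in each case.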

solution1
2 ACCEPTED 2016-04-18 05:54:43

solution2
-1 2016-04-18 05:52:17
