
Python lambda function underscore-colon syntax explanation?

In the following Python script, where "aDict" is a dictionary, what does "_: _[0]" do in the lambda function?

sorted(aDict.items(), key=lambda _: _[0])

In Python, _ (underscore) is a valid identifier and can be used as a variable name, e.g.:

>>> _ = 10
>>> print(_)
10

It can therefore also be used as the name of a parameter in a lambda expression, which is like an unnamed function.

In your example, sorted() passes each tuple produced by aDict.items() to its key function. The key function returns the first element of that tuple, which sorted() then uses as the key, i.e. the value that is compared with other values to determine the order.
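To make the "unnamed function" point concrete, here is a minimal sketch that spells the same key function out with def. The sample dictionary and the name first_element are made up for illustration; they are not from the question.

```python
# Hypothetical sample data standing in for aDict.
a_dict = {"b": 2, "a": 1, "c": 3}

def first_element(pair):
    """Named equivalent of `lambda _: _[0]`."""
    return pair[0]

by_lambda = sorted(a_dict.items(), key=lambda _: _[0])
by_def = sorted(a_dict.items(), key=first_element)

print(by_lambda)            # [('a', 1), ('b', 2), ('c', 3)]
print(by_lambda == by_def)  # True
```

The underscore plays exactly the role that pair plays in the named version.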

Note that, in this case, the same result can be produced without a key function, because tuples are compared by their first element, then by the second element, and so on. So

sorted(aDict.items())

will produce the same result. Because dictionaries cannot contain duplicate keys, the first element of each tuple is unique, so the second element is never considered when sorting.
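One way to check that the second element really is never compared is to use values that could not be compared with each other at all. This is only a sketch with made-up data; it still assumes the keys themselves are mutually comparable.

```python
# Hypothetical dictionary whose values are of mixed, mutually incomparable types.
a_dict = {"pear": 3, "apple": "n/a", "fig": None}

with_key = sorted(a_dict.items(), key=lambda _: _[0])
without_key = sorted(a_dict.items())

# Both sorts succeed: the keys are unique, so tuple comparison is always
# decided by the first element and the values are never compared.
print(without_key)              # [('apple', 'n/a'), ('fig', None), ('pear', 3)]
print(with_key == without_key)  # True
```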

Let's pick that apart.

1) Suppose you have a dict, di:

di={'one': 1, 'two': 2, 'three': 3}

2) Now suppose you want each of its key, value pairs:

>>> di.items()
[('three', 3), ('two', 2), ('one', 1)]

3) Now you want to sort them (dicts were unordered in older Python versions, and even in Python 3.7+, where they keep insertion order, they are not sorted):

>>> sorted(di.items())
[('one', 1), ('three', 3), ('two', 2)]

Notice that the tuples are sorted lexicographically, by the text in the first element of each tuple. This is equivalent to sorting on t[0] for each tuple t.

Suppose you wanted it sorted by the number instead. You would use a key function:

>>> sorted(di.items(), key=lambda t: t[1])
[('one', 1), ('two', 2), ('three', 3)]
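If you prefer to avoid the lambda altogether, operator.itemgetter from the standard library builds the same kind of key function. A short sketch using the di dict from above; the reverse=True call is just an extra illustration.

```python
from operator import itemgetter

di = {'one': 1, 'two': 2, 'three': 3}

# itemgetter(1) behaves like `lambda t: t[1]`.
by_value = sorted(di.items(), key=itemgetter(1))
by_value_desc = sorted(di.items(), key=itemgetter(1), reverse=True)

print(by_value)       # [('one', 1), ('two', 2), ('three', 3)]
print(by_value_desc)  # [('three', 3), ('two', 2), ('one', 1)]
```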

The expression you have, sorted(aDict.items(), key=lambda _: _[0]), is just using _ as a variable name. The key function also changes nothing: aDict.items() produces tuples, and without a key they are sorted by their first element anyway. The key function in your example is redundant.

There might be a use case for the form (other than for tuples) to consider. If you had strings instead, then you would be sorting by the first character and ignoring the rest:

>>> li=['car','auto','aardvark', 'arizona']
>>> sorted(li, key=lambda c:c[0])
['auto', 'aardvark', 'arizona', 'car']

Vs:

>>> sorted(li)
['aardvark', 'arizona', 'auto', 'car']
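The reason 'auto' comes before 'aardvark' in the keyed version is that Python's sort is stable: items with equal keys keep their original relative order. A small sketch of that, reusing li and adding a reversed copy purely for illustration:

```python
li = ['car', 'auto', 'aardvark', 'arizona']

# All three 'a' words share the key 'a', so they stay in input order.
by_first_char = sorted(li, key=lambda c: c[0])
print(by_first_char)  # ['auto', 'aardvark', 'arizona', 'car']

# Reversing the input reverses the order within the 'a' group as well.
by_first_char_rev = sorted(list(reversed(li)), key=lambda c: c[0])
print(by_first_char_rev)  # ['arizona', 'aardvark', 'auto', 'car']
```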

I still would not use _ in the lambda, however. The usual purpose of _ is to name a throwaway variable with minimal chance of side effects. Python's namespaces mostly make that worry moot.

Consider:

>>> c=22
>>> sorted(li, key=lambda c:c[0])
['auto', 'aardvark', 'arizona', 'car']
>>> c
22

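The value of c is preserved because the lambda's parameter lives in the lambda's own local namespace.

However, under Python 2.x (but not Python 3.x), this can be a problem, because a list comprehension's loop variable leaks into the enclosing scope: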

>>> c=22
>>> [c for c in '123']
['1', '2', '3']
>>> c
'3'

So the (light) convention became using _ for a variable in a list comprehension, a tuple unpacking, and similar places where you would otherwise worry about trampling on one of your names. The message is: if it is named _, I don't really care about it except right here...
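For comparison, this is roughly what the throwaway convention looks like in practice, and how the comprehension example behaves under Python 3. The record tuple and the other names are made up for illustration.

```python
# Hypothetical record where the middle field is not needed.
record = ("alice", "unused-field", 42)
name, _, age = record          # `_` signals "ignore this value"

# `_` as a loop variable whose value is never used.
stars = ["*" for _ in range(3)]

# Under Python 3 the comprehension variable does not leak out.
c = 22
digits = [c for c in "123"]

print(name, age, stars, c)  # alice 42 ['*', '*', '*'] 22
```

Here the underscore marks values that are deliberately ignored, which is different from the lambda in the question, where the argument is actually used.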

In Python, lambda is used to create an anonymous function. The first underscore in your example is simply the parameter of the lambda function. After the colon, which ends the parameter list, the expression _[0] retrieves the first element of the variable _.

Admittedly, this can be confusing; the lambda component of your example could be re-written as lambda x: x[0] with the same result. Conventionally, though, underscore variable names in Python are used for "throwaway variables". In this case, it implies that the only thing we care about in each dictionary item is the key. Nuanced to a fault, perhaps.

