Sort strings based on the number of distinct characters

I am confused about why the code below, which sorts strings by their number of distinct letters, needs the set() and list() calls.

strings = ['foo', 'card', 'bar', 'aaaa', 'abab']

strings.sort(key = lambda x: len(set(list(x))))
print(strings)

Thanks

In fact, the key part of that code is the set() function. Why? Because it returns a set, which contains no repeated elements. For example:

set('foo') -> {'f', 'o'}
set('aaaa') -> {'a'}
set('abab') -> {'a', 'b'}

Then, to sort by the number of distinct letters, len() is applied to that set.
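
To make this concrete, here is a small sketch that computes the count for each string and then sorts. The helper name distinct_count is only illustrative and does not come from the question.

```python
strings = ['foo', 'card', 'bar', 'aaaa', 'abab']

def distinct_count(s):
    # set(s) drops repeated characters; len() counts what is left.
    return len(set(s))

# Map each string to its number of distinct characters.
counts = {s: distinct_count(s) for s in strings}
print(counts)   # {'foo': 2, 'card': 4, 'bar': 3, 'aaaa': 1, 'abab': 2}

# Sort in place using that count as the key.
strings.sort(key=distinct_count)
print(strings)  # ['aaaa', 'foo', 'abab', 'bar', 'card']
```

'foo' and 'abab' both have two distinct characters; because Python's sort is stable, they keep their original relative order.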

Nice question! Let's peel the layers off the sort() call.

According to the Python docs on sort and sorted:

key specifies a function of one argument that is used to extract a comparison key from each list element: key=str.lower. The default value is None (compare the elements directly).

That is, sort takes a keyword argument key and expects it to be a function. Specifically, it wants a key(x) function that will be used to generate a key value for each string in the strings list; the list is then ordered by those key values instead of the usual lexical ordering. In the Python shell:

>>> key = lambda x: len(set(list(x)))
>>> ordering = [key(x) for x in strings]
>>> ordering
[2, 4, 3, 1, 2]

This could be any ordering scheme you like. Here, we want to order by the number of unique letters. That's where set and list come in. list("foo") gives ['f', 'o', 'o'], so len(list('foo')) == 3. That is the length of the word, not the number of unique characters.

>>> key2 = lambda x: len(list(x))
>>> ordering2 = [key2(x) for x in strings]
>>> ordering2
[3, 4, 3, 4, 4]

So we use set and list to get a set of characters. A set is like a list, except that it is unordered and keeps only one copy of each element. For instance, we can make a list of characters for any word like this:

>>> list(strings[0])
['f', 'o', 'o']

And a set:

>>> set(list(strings[0]))
{'o', 'f'}

The len() of that set is 2, so when sort compares the "foo" in strings[0] with the other strings[x] in strings, it compares these key values rather than the strings themselves. For example:

>>> (len(set(strings[0][:])) < len(set(strings[1][:])))
True

Which gives us the ordering we want.
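
As a rough illustration of what the key argument does, the same result can be reproduced by hand: pair each string with its key, sort the pairs, and then strip the keys off again. This is only a sketch of the idea, not how CPython implements sort internally, and the names distinct_count, decorated and manual are made up for the example.

```python
strings = ['foo', 'card', 'bar', 'aaaa', 'abab']

def distinct_count(s):
    # Number of distinct characters in s.
    return len(set(s))

# Decorate: (key, original index, string). The index breaks ties so that
# strings with equal keys keep their original order, as a stable sort would.
decorated = [(distinct_count(s), i, s) for i, s in enumerate(strings)]
decorated.sort()

# Undecorate: keep only the strings, now in key order.
manual = [s for _, _, s in decorated]

builtin = sorted(strings, key=distinct_count)
print(manual)             # ['aaaa', 'foo', 'abab', 'bar', 'card']
print(manual == builtin)  # True
```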

EDIT: @PeterGibson pointed out above that list(strings[i]) isn't needed. This is true because strings are iterable in Python, just like lists:

>>> set("foo")
{'o', 'f'}
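
A quick check of that claim, assuming the same strings list as in the question: the key with list() and the key without it produce the same value for every string, so the shorter form can be used directly. The names with_list and without_list are just for this example.

```python
strings = ['foo', 'card', 'bar', 'aaaa', 'abab']

with_list = lambda x: len(set(list(x)))  # key from the question
without_list = lambda x: len(set(x))     # same key, minus the list() call

# Both keys agree on every string in the list.
same = all(with_list(s) == without_list(s) for s in strings)
print(same)    # True

# sorted() returns a new list and leaves strings untouched.
result = sorted(strings, key=without_list)
print(result)  # ['aaaa', 'foo', 'abab', 'bar', 'card']
```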
