python: custom sort: not purely lexicographical but reverse and shortest common first

Background

I want to sort in reverse order, but not strictly lexicographically, and then it gets even weirder.. :P

The reason is that a proprietary piece of software parses directories exactly the way I describe here, and I want to copy that behavior.

Requirements (in that order)

  1. Compatible with both python2 and python3
  2. Reverse lexicographical
  3. Shortest common first

Example data

The following is an example of (randomly ordered) input data for that python script:

IA-test-PROD-me
ia-test-prod-me
ia-test-me-staging
ia-test-me
ia-test-STAGING-me
IA-test-me
IA-test-me-staging
ia-test-me-prod
IA-test-me-STAGING
IA-test-me-prod
IA-test-me-PROD
IA-test-STAGING-me

How it should look

I store that in a list and need to sort it so that it ends up looking like this:

ia-test-me
ia-test-prod-me
ia-test-me-staging
ia-test-me-prod
ia-test-STAGING-me
IA-test-me
IA-test-me-staging
IA-test-me-prod
IA-test-me-STAGING
IA-test-me-PROD
IA-test-STAGING-me
IA-test-PROD-me

Code

From what I understand, sort() and sorted() are stable functions that sort lexicographically. But since I need to meet all of the above requirements, I am stuck at the moment..

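def sortLexo(input_text):
    # input_text is a single whitespace-separated string of names
    words = input_text.split()
    # plain reverse lexicographical sort; requirement 3 is still unmet
    words.sort(reverse=True)

    for i in words:
        print(i)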

The problem is that sort() with reverse=True alone is not enough, as it does not fulfill requirement 3 (shortest first) above:

           <-------------. should be placed here
ia-test-prod-me          |
ia-test-me-staging      /|\
ia-test-me-prod          |
ia-test-me    -------> wrong
ia-test-STAGING-me
           <--------------- should be placed here
IA-test-me-staging        |
IA-test-me-prod          /|\
IA-test-me-STAGING        |
IA-test-me-PROD           |
IA-test-me    --------> wrong
IA-test-STAGING-me
IA-test-PROD-me
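
A minimal reproduction of the problem, using three of the names above (the variable names are only for illustration): with a plain reverse sort, a name that is a prefix of the others ends up after them instead of before them.

```python
# Three names where the first is a prefix of the other two.
words = ["ia-test-me", "ia-test-me-prod", "ia-test-me-staging"]

# Plain reverse lexicographical sort: the shortest name lands last,
# although requirement 3 wants it first.
plain_reverse = sorted(words, reverse=True)
print(plain_reverse)
# ['ia-test-me-staging', 'ia-test-me-prod', 'ia-test-me']
```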

I've played around with groupby to sort by length, but I got nowhere (my python knowledge isn't that deep) .. :(

I guess this is super easy for someone with good python know-how.. any help appreciated!

Trying to piece this together based on the description. It seems like you want to pad the right side of the comparison string with the highest character you expect to receive (I use the character 0xFF, but if you're using Unicode instead of ASCII you might need a higher number).

MAX_LENGTH = max(len(word) for word in words)
sorted(words, key=lambda word: word + "\xFF" * (MAX_LENGTH - len(word)), reverse=True)

This will produce the following. It's different from the output in your question, but I can't work out what specification would produce that output.

ia-test-prod-me
ia-test-me
ia-test-me-staging
ia-test-me-prod
ia-test-STAGING-me
IA-test-me
IA-test-me-staging
IA-test-me-prod
IA-test-me-STAGING
IA-test-me-PROD
IA-test-STAGING-me
IA-test-PROD-me

What the code does is this: the key function creates the key used for comparison. In this case, we take the word and pad its right side with the highest character that we would expect to find in the string; that is the expression "\xFF" * (MAX_LENGTH - len(word)). It might seem strange to use the multiplication operator on a string, but it works: it repeats the string that many times, here once for each character the current string falls short of the maximum length.

In normal alphabetical sorting (like in a dictionary), a string that is a prefix of a longer one comes first. Padding with the highest character reverses that: for two strings that match up to the end of the shorter one (say ia-test-me and ia-test-me-staging), the shorter string now sorts last, and since we reverse the whole list with reverse=True, it ends up first.
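
Putting the pieces together, here is a self-contained sketch of the padding approach that should run on both python2 and python3. The function name sort_reverse_prefix_first and the pad parameter are my own additions for illustration, and the default pad assumes the names are plain ASCII.

```python
def sort_reverse_prefix_first(words, pad="\xff"):
    """Reverse-lexicographic sort where a prefix precedes its extensions.

    Assumption: `pad` compares greater than every character that occurs
    in `words` (character 0xFF is enough for plain ASCII names).
    """
    words = list(words)
    if not words:
        return []
    max_length = max(len(word) for word in words)
    # Right-pad every key to the same length with the highest character,
    # so that a shorter prefix outranks its longer extensions.
    return sorted(
        words,
        key=lambda word: word + pad * (max_length - len(word)),
        reverse=True,
    )


names = """
IA-test-PROD-me ia-test-prod-me ia-test-me-staging ia-test-me
ia-test-STAGING-me IA-test-me IA-test-me-staging ia-test-me-prod
IA-test-me-STAGING IA-test-me-prod IA-test-me-PROD IA-test-STAGING-me
""".split()

for name in sort_reverse_prefix_first(names):
    print(name)
```

For names that contain characters above 0xFF, a higher pad would be needed; on python3, passing pad="\U0010FFFF" (the highest Unicode code point) is one option, assuming that character never appears in the names themselves.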
