
Sorting list of dictionaries with primary key from list of keywords and alphabetical order as secondary key

I have a list of dictionaries that I want to sort with a list of keywords as the primary key, and with entries that tie on the keyword ordered alphabetically.

Currently I sort alphabetically first and then by the provided keywords, which produces the desired result because the sorting algorithm is stable. However, I think this can be done in one step, but I don't know how. Can anyone help?

Secondly, I would like the keyword part of the sort to match keywords contained in the name instead of requiring exact matches. How can I do this?

Here's my code so far:

# Define the keywords I want to see first
preferred_projects = ['project one', 'project two', 'project three']

# example data
AllMyProjectsFromaDatasource = [{ 'name': 'project two', 'id': 5, 'otherkey': 'othervalue'},
                                { 'name': 'project three', 'id': 1, 'otherkey': 'othervalue'},
                                { 'name': 'project one', 'id': 3, 'otherkey': 'othervalue'},
                                { 'name': 'abc project', 'id': 6, 'otherkey': 'othervalue'},
                                { 'name': 'one project', 'id': 9, 'otherkey': 'othervalue'}
                               ]    


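def sort_by_preferred(key):
    """Return the sort position of an entry based on its preferred name."""
    sortkey = key['name']
    # Entries whose name is not preferred are placed after all preferred ones
    return preferred.index(sortkey) if sortkey in preferred else len(preferred)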

# First sort alphabetical    
AllProjects = sorted(AllMyProjectsFromaDatasource,
                     key=lambda k: k['name'])

# Then sort by keyword
preferred = preferred_projects
AllProjects.sort(key=sort_by_preferred)

So actually I want to define my "sorting filter" just like this:

preferred_projects = ['one', 'two', 'three']

And have the list sorted like this:

[{ 'name': 'one project', 'id': 9, 'otherkey': 'othervalue'},
 { 'name': 'project one', 'id': 3, 'otherkey': 'othervalue'},
 { 'name': 'project two', 'id': 5, 'otherkey': 'othervalue'},
 { 'name': 'project three', 'id': 1, 'otherkey': 'othervalue'},
 { 'name': 'abc project', 'id': 6, 'otherkey': 'othervalue'},]    

You could return a tuple as your sort key. The first element is the entry's index in preferred_projects, falling back to len(preferred_projects) for names that are not in the list so that they sort after all preferred entries. The second element is the name, which gives the alphabetical order among entries with the same index:

preferred_projects = ['project one', 'project two', 'project three']

def sort_by(entry):
    name = entry['name']

    try:
        index = preferred_projects.index(name)
    except ValueError:
        index = len(preferred_projects)

    return (index, name)

AllMyProjectsFromaDatasource = [
    { 'name': 'project two', 'id': 5, 'otherkey': 'othervalue'},
    { 'name': 'project three', 'id': 1, 'otherkey': 'othervalue'},
    { 'name': 'project one', 'id': 3, 'otherkey': 'othervalue'},
    { 'name': 'abc project', 'id': 6, 'otherkey': 'othervalue'},
    { 'name': 'one project', 'id': 9, 'otherkey': 'othervalue'}]    

AllProjects = sorted(AllMyProjectsFromaDatasource, key=sort_by)

for p in AllProjects:
    print(p)

Giving you the following output:

{'otherkey': 'othervalue', 'name': 'project one', 'id': 3}
{'otherkey': 'othervalue', 'name': 'project two', 'id': 5}
{'otherkey': 'othervalue', 'name': 'project three', 'id': 1}
{'otherkey': 'othervalue', 'name': 'abc project', 'id': 6}
{'otherkey': 'othervalue', 'name': 'one project', 'id': 9}
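
The tuple key works because Python compares tuples element by element, so the name is only consulted when two entries share the same index. As a minimal sketch of the same idea, the variant below looks up each name's position in a precomputed dictionary instead of calling list.index for every entry. The names rank, sort_key and projects are illustrative and not part of the original code, and the data is shortened.

```python
preferred_projects = ['project one', 'project two', 'project three']

# Map each preferred name to its position once (illustrative helper name).
rank = {name: position for position, name in enumerate(preferred_projects)}

def sort_key(entry):
    name = entry['name']
    # Names that are not preferred all share the rank len(preferred_projects),
    # so they land after the preferred ones and are ordered by name.
    return (rank.get(name, len(preferred_projects)), name)

projects = [
    {'name': 'project two', 'id': 5},
    {'name': 'abc project', 'id': 6},
    {'name': 'project one', 'id': 3},
    {'name': 'one project', 'id': 9},
]

sorted_names = [p['name'] for p in sorted(projects, key=sort_key)]
print(sorted_names)
```

For a handful of preferred names the difference from list.index is negligible; the dictionary mainly helps if the preferred list grows large.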

You can use the in operator to find out whether a string is contained in another string.

For the Unicode and string types, x in y is true if and only if x is a substring of y. An equivalent test is y.find(x) != -1. [...] Empty strings are always considered to be a substring of any other string, so "" in "abc" will return True.

You can use this to implement your keyword sorting key.
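
As a small illustration of that substring test, the sketch below finds the position of the first keyword contained in a name. The helper name first_keyword_rank is hypothetical; it is just one way to express the lookup.

```python
preferred_projects = ['one', 'two', 'three']

def first_keyword_rank(name, keywords):
    """Return the index of the first keyword found in name, or len(keywords)."""
    for position, keyword in enumerate(keywords):
        if keyword in name:  # substring test
            return position
    # No keyword matched: rank the name after every keyword
    return len(keywords)

print('one' in 'project one')
print(first_keyword_rank('one project', preferred_projects))
print(first_keyword_rank('abc project', preferred_projects))
```

Keep in mind that this is a plain substring test, so the keyword 'one' would also match a name such as 'someone else'.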

You'd use the approach given in the other answer (pass a tuple as key) to implement the alphabetical sorting as a secondary key.

Here's an example:

import pprint

# Define the keywords I want to see first
preferred_projects = ['one', 'two', 'three']

# example data
AllMyProjectsFromaDatasource = [{ 'name': 'project two', 'id': 5, 'otherkey': 'othervalue'},
                                { 'name': 'project three', 'id': 1, 'otherkey': 'othervalue'},
                                { 'name': 'project one', 'id': 3, 'otherkey': 'othervalue'},
                                { 'name': 'abc project', 'id': 6, 'otherkey': 'othervalue'},
                                { 'name': 'one project', 'id': 9, 'otherkey': 'othervalue'}
                               ]    

def keyfunc(x):
    # keyword primary key
    # (add index to list comprehension when keyword is in name)
    preferred_key = [float(idx) 
                     for idx, i in enumerate(preferred_projects)
                     if i in x['name']]
    # found at least one match in preferred keywords, use first if any, else infinity
    keyword_sortkey = preferred_key[0] if preferred_key else float('inf')
    # return tuple to sort according to primary and secondary key
    return keyword_sortkey, x['name']

AllMyProjectsFromaDatasource.sort(key=keyfunc)

pprint.pprint(AllMyProjectsFromaDatasource)

The output is:

[{'id': 9, 'name': 'one project', 'otherkey': 'othervalue'},
 {'id': 3, 'name': 'project one', 'otherkey': 'othervalue'},
 {'id': 5, 'name': 'project two', 'otherkey': 'othervalue'},
 {'id': 1, 'name': 'project three', 'otherkey': 'othervalue'},
 {'id': 6, 'name': 'abc project', 'otherkey': 'othervalue'}]
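
To connect this back to the two-step approach from the question, here is a rough sketch that checks that a single sort with a tuple key gives the same order as sorting by name first and then by keyword, relying on the sort being stable. The function name keyword_rank and the shortened data are assumptions made for the illustration.

```python
preferred_projects = ['one', 'two', 'three']

def keyword_rank(entry):
    # Index of the first keyword contained in the name, or
    # len(preferred_projects) if no keyword matches.
    return next(
        (position for position, keyword in enumerate(preferred_projects)
         if keyword in entry['name']),
        len(preferred_projects),
    )

projects = [
    {'name': 'project two', 'id': 5},
    {'name': 'project three', 'id': 1},
    {'name': 'project one', 'id': 3},
    {'name': 'abc project', 'id': 6},
    {'name': 'one project', 'id': 9},
]

# One pass: the tuple carries the primary and the secondary key.
one_pass = sorted(projects, key=lambda e: (keyword_rank(e), e['name']))

# Two passes: secondary key first, then a stable sort on the primary key.
two_pass = sorted(sorted(projects, key=lambda e: e['name']), key=keyword_rank)

print(one_pass == two_pass)
print([p['name'] for p in one_pass])
```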
