
Python: Sort list of similar strings based on another string

I have a string

deplete mineral resources , from 123 in x 123 in x 19 ft , on 24 ft t shaped hole

and a list of strings

['123', '123', '19', '24', 'in', 'in', 'ft', 'ft', 'deplete mineral', 't', 'resources', 'shaped hole']

I want to sort this list so that its items appear in the same order as they do in the given string. When I tried sorted(l, key=s.index), I got this output:

['deplete mineral', 't', 'in', 'in', 'resources', '123', '123', '19', 'ft', 'ft', '24', 'shaped hole']

But my desired output is:

['deplete mineral', 'resources', '123', 'in', '123', 'in', '19', 'ft', '24', 'ft', 't', 'shaped hole']

The list should follow the order of the given string exactly. Is there an efficient way to achieve this?

The following produces the desired order. It is not technically a sort, though: it is a regular expression search of the sort string.

>>> import re
>>>
>>> sort_str = "deplete mineral resources , from 123 in x 123 in x " \
...            "19 ft , on 24 ft t shaped hole"
>>> 
>>> str_list = ['123', '123', '19', '24', 'in', 'in', 'ft', 'ft', 
...             'deplete mineral', 't', 'resources', 'shaped hole']
>>> 
>>> re.findall('|'.join(str_list), sort_str)
['deplete mineral', 'resources', '123', 'in', '123', 'in', '19', 
 'ft', '24', 'ft', 't', 'shaped hole']
>>>
>>> desired = ['deplete mineral', 'resources', '123', 'in', '123',
...            'in', '19', 'ft', '24', 'ft', 't', 'shaped hole']
>>> desired == re.findall('|'.join(str_list), sort_str)
True

The regular expression is simple. It has the form "alt_1|alt_2|alt_3". This alternation (an OR-like expression) produces a pattern that scans a string for the substrings "alt_1", "alt_2", or "alt_3".

str_list is joined to form this alternation as follows:

>>> '|'.join(str_list)
'123|123|19|24|in|in|ft|ft|deplete mineral|t|resources|shaped hole'

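For this input, the order of the alternatives is not important: they could be joined in any order, because no item is a prefix of a different item. In general, Python tries the alternatives from left to right at each position and takes the first one that matches, so if one item were a prefix of another (say '12' and '123'), the longer one would have to come first. Items containing regular expression metacharacters such as '.' or '+' would also need escaping.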
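For lists where items might overlap as prefixes or contain special characters, a slightly more defensive way to build the pattern can be sketched as below. The helper name `build_pattern` is made up for illustration and is not part of the original answer; it only relies on `re.escape()` and on putting longer items first.

```python
import re

def build_pattern(items):
    """Hypothetical helper: build a literal alternation from a list of strings."""
    # Drop duplicates (keeping first-seen order), then put longer items first,
    # because re takes the first alternative that matches at a position.
    unique = sorted(dict.fromkeys(items), key=len, reverse=True)
    # re.escape() makes characters such as '.' or '+' match literally.
    return re.compile('|'.join(re.escape(item) for item in unique))

sort_str = ("deplete mineral resources , from 123 in x 123 in x "
            "19 ft , on 24 ft t shaped hole")
str_list = ['123', '123', '19', '24', 'in', 'in', 'ft', 'ft',
            'deplete mineral', 't', 'resources', 'shaped hole']

result = build_pattern(str_list).findall(sort_str)
print(result)

# With a prefix pair, longest-first ordering keeps '123' from being cut to '12'.
prefix_result = build_pattern(['12', '123']).findall('123 12')
print(prefix_result)
```

For the list in the question this gives the same result as the plain '|'.join(str_list) pattern; the difference only shows up with overlapping or special-character items.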

When this string is passed as the first argument to re.findall(), it is compiled into a regular expression and used to find all matching substrings in sort_str:

>>> re.findall('|'.join(str_list), sort_str)

re.findall() scans sort_str from beginning to end, looking for substrings that appear in str_list. Each non-overlapping match is appended to the list it returns.
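To make the left-to-right scan visible, the same pattern can be run through re.finditer(), which yields match objects carrying their positions. This is only an illustrative sketch; the variable names `pattern` and `matches` are chosen here and do not come from the original answer.

```python
import re

sort_str = ("deplete mineral resources , from 123 in x 123 in x "
            "19 ft , on 24 ft t shaped hole")
str_list = ['123', '123', '19', '24', 'in', 'in', 'ft', 'ft',
            'deplete mineral', 't', 'resources', 'shaped hole']

pattern = re.compile('|'.join(str_list))

# Record where each match starts alongside the matched text.
matches = [(m.start(), m.group()) for m in pattern.finditer(sort_str)]

for start, text in matches:
    print(start, repr(text))
```

The start positions only ever increase, which is what ties the order of the results to the order of sort_str.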

So the matched substrings come back in the same order as they appear in sort_str.
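By contrast, the sorted(l, key=s.index) attempt from the question goes wrong because str.index() returns the position of the first occurrence of a substring anywhere in the string. A quick check, using the same data, shows that 't' is found inside "deplete" and 'in' inside "mineral", and that duplicate items always receive the same key:

```python
sort_str = ("deplete mineral resources , from 123 in x 123 in x "
            "19 ft , on 24 ft t shaped hole")
str_list = ['123', '123', '19', '24', 'in', 'in', 'ft', 'ft',
            'deplete mineral', 't', 'resources', 'shaped hole']

# The sort key each item receives: index of its first occurrence in sort_str.
keys = {item: sort_str.index(item) for item in str_list}
for item, position in sorted(keys.items(), key=lambda pair: pair[1]):
    print(position, repr(item))

# Reproduces the unwanted order from the question.
by_index = sorted(str_list, key=sort_str.index)
print(by_index)
```

Because both copies of '123', 'in', and 'ft' map to a single position each, no key function based on str.index() alone can interleave them the way the desired output requires.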
