
Rearrange words in a string based on which category set they belong to

How would I rearrange a string based on what category it belongs to? Let's say I have these sets:

dogs = {'husky', 'chihuahua', 'labrador', 'beagle'} 
flowers = {'dandelion', 'rose', 'tulip'} 
colours = {'blue', 'yellow', 'green', 'red', 'pink'}

Then let's say I wanted to input a string and rearrange the words based on their category.

'husky tulip red orange'

would become

'red orange husky tulip'

The order would be colours first, then dogs, then flowers. Maybe create a list of the categories in order? I'm not too sure how I would go about this.

Use a key function with sorted:

def ref(s):
    dogs = {'husky', 'chihuahua', 'labrador', 'beagle'} 
    flowers = {'dandelion', 'rose', 'tulip'} 
    colours = {'blue', 'yellow', 'green', 'red', 'pink', 'orange'}
    if s in colours: rtr=-3
    elif s in dogs: rtr=-2
    elif s in flowers: rtr=-1
    else: rtr=0   # this puts words not found at end of string
    return rtr 

s='husky tulip red orange'

>>> ' '.join(sorted(s.split(), key=ref))
'red orange husky tulip'

More Pythonic (and easier to extend) is to do something like this:

def ref(s):
    dogs = {'husky', 'chihuahua', 'labrador', 'beagle'} 
    flowers = {'dandelion', 'rose', 'tulip'} 
    colours = {'blue', 'yellow', 'green', 'red', 'pink', 'orange'}
    key_t=(colours, dogs, flowers)
    try: 
        return next(i for i, v in enumerate(key_t) if s in v)
    except StopIteration:
        return -1  # this puts words not found at the beginning of the string
    # or use the default argument version of next:
    # return next((i for i, v in enumerate(key_t) if s in v), -1)

And use that key function the same way.
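For instance, here is a minimal self-contained sketch of that usage. It assumes the sets are defined once at module level, and it returns a rank after every category so that unmatched words go to the end instead of the beginning. The names CATEGORIES and category_rank are hypothetical, not from the answer above.

```python
# Assumption: the sets live at module level so they are built only once.
colours = {'blue', 'yellow', 'green', 'red', 'pink', 'orange'}
dogs = {'husky', 'chihuahua', 'labrador', 'beagle'}
flowers = {'dandelion', 'rose', 'tulip'}
CATEGORIES = (colours, dogs, flowers)  # hypothetical name; tuple order = output order

def category_rank(word):
    # Index of the first category containing the word, or len(CATEGORIES)
    # (a rank after every category) if no category contains it.
    return next((i for i, cat in enumerate(CATEGORIES) if word in cat),
                len(CATEGORIES))

s = 'husky tulip cat red orange'
result = ' '.join(sorted(s.split(), key=category_rank))
print(result)  # red orange husky tulip cat
```

Because sorted is stable, words with the same rank keep their relative order from the input.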

You can also iterate over the sets, using chain to join them into a single iterable (this assumes colours, dogs and flowers are defined at module level rather than inside ref):

>>> from itertools import chain
>>> words = set(s.split())  # split once instead of once per element
>>> [e for e in chain(colours, dogs, flowers) if e in words]
['orange', 'red', 'husky', 'tulip']

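Which is faster or better depends on the size of the string and the size of the sets. Note that with chain the order within each category follows set iteration order rather than the order of the input (for example 'orange' before 'red' above), and words that are in no category are dropped. Also, if you want secondary sorts (such as lexicographic order within the individual categories), you need to use the sorted function.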
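A secondary sort of that kind can be sketched by returning a tuple from the key function: the category index first, then the word itself. This is only an illustration; the name ref_with_tiebreak is hypothetical, and unmatched words are placed last here.

```python
colours = {'blue', 'yellow', 'green', 'red', 'pink', 'orange'}
dogs = {'husky', 'chihuahua', 'labrador', 'beagle'}
flowers = {'dandelion', 'rose', 'tulip'}
key_t = (colours, dogs, flowers)

def ref_with_tiebreak(word):
    # Primary key: position of the word's category in key_t
    # (len(key_t) if the word is in no category).
    rank = next((i for i, v in enumerate(key_t) if word in v), len(key_t))
    # Secondary key: the word itself, giving lexicographic order
    # within each category.
    return (rank, word)

s = 'tulip husky red beagle orange rose blue'
result = ' '.join(sorted(s.split(), key=ref_with_tiebreak))
print(result)  # blue orange red beagle husky rose tulip
```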

Try this:

#Define all the categories
dogs = ['husky', 'chihuahua', 'labrador', 'beagle']
flowers = ['dandelion', 'rose', 'tulip']
colours = ['blue', 'yellow', 'green', 'red', 'pink', 'orange']

#The Input String
outOfOrder = "husky tulip red orange"

#Split up the string into an array with each word separated
outOfOrderArray = outOfOrder.split()

#Array to hold all words of each category
orderedArray = [[], [], [], []]

#loop through all the words in the array
for word in outOfOrderArray:

    #Check which category the word is in (colours first, then dogs, then flowers).
    if word in colours:
        orderedArray[0].append(word)
    elif word in dogs:
        orderedArray[1].append(word)
    elif word in flowers:
        orderedArray[2].append(word)

    #If it's not in any category, do whatever you want with it. I just stuck them at the end.
    else:
        orderedArray[3].append(word)

orderedString = ""

#Combine all the words in ordered Array to create a final string
for category in orderedArray:
    for word in category:
        orderedString = orderedString + word + " "

#Strip the trailing space left by the loop above
print(orderedString.strip())
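The same bucketing idea can be sketched as a reusable function that takes the categories in the desired output order. This is a minimal sketch, not the answer's own code; the name reorder and its parameters are hypothetical.

```python
def reorder(text, categories):
    # One bucket per category, plus a final bucket for unmatched words.
    buckets = [[] for _ in range(len(categories) + 1)]
    for word in text.split():
        for i, category in enumerate(categories):
            if word in category:
                buckets[i].append(word)
                break
        else:
            # The inner loop finished without a break: no category matched.
            buckets[-1].append(word)
    # Flatten the buckets in order; join avoids a trailing space.
    return ' '.join(word for bucket in buckets for word in bucket)

dogs = ['husky', 'chihuahua', 'labrador', 'beagle']
flowers = ['dandelion', 'rose', 'tulip']
colours = ['blue', 'yellow', 'green', 'red', 'pink', 'orange']

print(reorder('husky tulip red orange', [colours, dogs, flowers]))
# red orange husky tulip
```

Adding a category then only means adding one more entry to the list passed in, rather than another elif branch.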

You could push flowers to the back with key 1 and pull colours to the front with key -1 (and everything else will go in the middle with key 0):

>>> ' '.join(sorted(s.split(), key=lambda w: (w in flowers) - (w in colours)))
'red husky orange tulip'

Note that "orange" isn't in any of your categories, which is why it ended up in the middle along with "husky".

A more general way pushes flowers furthest back, dogs less far back, and colours least far back (words in no category stay at the very front):

>>> ' '.join(sorted(s.split(), key=lambda w: (w in flowers, w in dogs, w in colours)))
'orange red husky tulip'
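If unmatched words should go last instead of first, one variation on this boolean-tuple idea is to test "not in" against the categories in their desired order. The sketch below uses the sets exactly as given in the question (so 'orange' is in none of them); the names categories and not_in_key are hypothetical.

```python
# Sets as given in the question ('orange' is in none of them).
dogs = {'husky', 'chihuahua', 'labrador', 'beagle'}
flowers = {'dandelion', 'rose', 'tulip'}
colours = {'blue', 'yellow', 'green', 'red', 'pink'}

categories = (colours, dogs, flowers)  # hypothetical name; desired output order

def not_in_key(word):
    # One boolean per category: False where the word belongs, True elsewhere.
    # False sorts before True, so earlier categories come first and a word
    # in no category (all True) sorts last.
    return tuple(word not in cat for cat in categories)

s = 'husky tulip red orange'
result = ' '.join(sorted(s.split(), key=not_in_key))
print(result)  # red husky tulip orange
```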

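Simply pluck out the matching strings from the collections, in category order (with 'orange' added to colours):

>>> input_list = 'husky tulip red orange'.split()
>>> [i for i in list(colours) + list(dogs) + list(flowers) if i in input_list]
['red', 'orange', 'husky', 'tulip']

No need to perform sorting or anything, although with sets the order within each category is arbitrary.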

It is even simpler if you define them as lists instead of sets in the first place. That also keeps the order in which you wrote the items in each list, instead of sorting all items collectively:

dogs = ['husky', 'chihuahua', 'labrador', 'beagle']
flowers = ['dandelion', 'rose', 'tulip'] 
colours = ['blue', 'yellow', 'green', 'red', 'pink', 'orange']

input_list = 'husky tulip red orange pink blue'.split()

[i for i in colours + dogs + flowers if i in input_list]

outputs:

['blue', 'red', 'pink', 'orange', 'husky', 'tulip'] # note the ordering

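PS: This seems the most Pythonic approach to me. Keep in mind that it drops duplicate words and words that are in no category, and that each membership test against input_list scans the whole list, so converting input_list to a set first helps for long inputs.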
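To make the behaviour of the plucking approach concrete, here is a small sketch with a duplicate word and a word that is in no category. It assumes the categories are lists in the question's order (colours, dogs, flowers) and uses a hypothetical input_words set for the membership tests.

```python
dogs = ['husky', 'chihuahua', 'labrador', 'beagle']
flowers = ['dandelion', 'rose', 'tulip']
colours = ['blue', 'yellow', 'green', 'red', 'pink', 'orange']

input_list = 'husky tulip red red cat'.split()
input_words = set(input_list)  # set membership tests are O(1) on average

# Walk the categories in output order and keep the words present in the input.
plucked = [w for w in colours + dogs + flowers if w in input_words]
print(plucked)  # ['red', 'husky', 'tulip']
```

The second 'red' and the unmatched 'cat' do not appear in the result, so this fits only when each word occurs once and every word belongs to some category.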

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please cite the site URL or the original address. For any questions, please contact: yoyou2525@163.com.

 