
How do I compare and group equivalent items in the same list in Python?

Note: I am using Python 3.4

I currently have a list of lists containing the following objects:

class word(object): #object class

    #each word object has 3 attributes (self explanatory)
    def __init__(self, originalWord=None, azWord=None, wLength=None):
        self.originalWord = originalWord
        self.azWord = azWord    #the originalWord alphabetized
        self.wLength = wLength

I want to iterate through the list to see whether 2 consecutive items have the same azWord attribute. E.g. bat and tab would both have azWord "abt", so they are anagrams. The end goal is to group anagrams and print them to a file. The lists are grouped by word length, and each list is sorted by each object's azWord.

I want to do this by comparing the current item to the next one. If their azWord values are identical, I want to add them to a temporary list. When I encounter an item that is no longer identical, I would like to print my collection of anagrams to a file and begin a new temporary list to continue checking for anagrams. This is what I have thus far:

for row in results:
    for item in row:
        if <<current item is identical to next item>>:
            tempList = []
            <<add to tempList>>
        else:
            tempList[:]=[]

I'm not quite sure how to structure this so that items don't get written twice (e.g. cat, tab, tab, abt) and nothing gets erased before it is printed to the file.

You're probably looking for something like this:

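from collections import defaultdict

anagrams = defaultdict(list)
for row in results:        # results is a list of lists, as in the question
    for item in row:
        # every object with the same azWord lands in the same bucket
        anagrams[item.azWord].append(item)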

This differs slightly from your original implementation: here it doesn't matter whether the anagrams are out of order (that is, anagrams need not be right next to each other).
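To carry the grouping through to the stated goal of printing anagrams to a file, something along these lines might work. This is only a sketch: `make_word`, `group_anagrams`, and `write_anagrams` are hypothetical helper names, the sample words are made up, and an in-memory `io.StringIO` buffer stands in for a real file object.

```python
from collections import defaultdict
import io


class word(object):
    # Same shape as the class in the question.
    def __init__(self, originalWord=None, azWord=None, wLength=None):
        self.originalWord = originalWord
        self.azWord = azWord
        self.wLength = wLength


def make_word(text):
    # Hypothetical helper: derives azWord and wLength from the text.
    return word(text, ''.join(sorted(text)), len(text))


def group_anagrams(results):
    # results is a list of lists of word objects, as in the question.
    anagrams = defaultdict(list)
    for row in results:
        for item in row:
            anagrams[item.azWord].append(item.originalWord)
    return anagrams


def write_anagrams(anagrams, out):
    # One line per group; words without an anagram partner are skipped.
    for key in sorted(anagrams):
        group = anagrams[key]
        if len(group) > 1:
            out.write(' '.join(group) + '\n')


# Made-up sample data, grouped by word length like the question's lists.
results = [[make_word(w) for w in ('bat', 'cat', 'tab')],
           [make_word(w) for w in ('care', 'race', 'bird')]]

buffer = io.StringIO()  # stands in for a file opened with open(..., 'w')
write_anagrams(group_anagrams(results), buffer)
print(buffer.getvalue())
```

Because each word is appended exactly once and nothing is written until the grouping is complete, no word is written twice and no group is cleared before it reaches the file.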

On a side note, you could probably structure your word class more efficiently like so:

# As a convention in python, class names are capitalized
class Word(str):
    def az(self):
        return ''.join(sorted(self))

Then your code would look like:

from collections import defaultdict

anagrams = defaultdict(list)
for row in results:        # results is a list of lists of Word instances
    for item in row:
        anagrams[item.az()].append(item)
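As a quick check of how a `str` subclass behaves here, the following sketch builds a few `Word` instances and groups them. The sample words and the flat `words` list are assumptions made for brevity; they are not taken from the question.

```python
from collections import defaultdict


class Word(str):
    def az(self):
        # The letters of the word in alphabetical order.
        return ''.join(sorted(self))


# Made-up sample data; a flat list is assumed here for brevity.
words = [Word(w) for w in ('bat', 'tab', 'cat', 'act', 'dog')]

anagrams = defaultdict(list)
for w in words:
    anagrams[w.az()].append(w)

print(Word('tab').az())   # the alphabetized form
print(len(Word('tab')))   # str methods still work, so no wLength attribute is needed
print(dict(anagrams))
```

One trade-off worth noting: `az()` re-sorts the letters on every call, whereas the original class stored `azWord` once at construction.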

To elaborate on Adam Smith's comment... you probably want something like this:

import itertools

# groupby only merges adjacent items, so sort by the same key first
x.sort(key=lambda i: i.azWord)
groups = [list(items) for azword, items in itertools.groupby(x, key=lambda i: i.azWord)]

For example, if you had the following:

x = [ x1, x2, x3, x4 ]  # where x1 & x4 have the same azWords

Then you'd get the desired grouping (sorted based on azWord):

[ [x1,x4], [x2], [x3] ]
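A concrete run of the same idea might look like the sketch below. The `Entry` namedtuple and `make_entry` helper are hypothetical stand-ins for the word class in the question, and the sample words are made up.

```python
import itertools
from collections import namedtuple

# Hypothetical stand-in for the word class in the question.
Entry = namedtuple('Entry', ['originalWord', 'azWord'])


def make_entry(text):
    return Entry(text, ''.join(sorted(text)))


# The first and last items share an azWord, like x1 and x4 above.
x = [make_entry(w) for w in ('tab', 'cat', 'dog', 'bat')]

# sorted() is stable, so words sharing an azWord keep their input order.
x_sorted = sorted(x, key=lambda i: i.azWord)

# groupby compares each item's key with the previous one, which is the
# "compare the current item to the next" loop from the question.
groups = [[e.originalWord for e in items]
          for azword, items in itertools.groupby(x_sorted, key=lambda i: i.azWord)]
print(groups)  # [['tab', 'bat'], ['cat'], ['dog']]
```

Since the lists in the question are already sorted by azWord, the sorting step could be skipped there and `groupby` applied to each row directly.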
