
Python: finding matches in a data structure

I have a data structure in this form:

from collections import namedtuple

Song = namedtuple('Song', ['fullpath', 'tags']) # tags is a dictionary
Album = namedtuple('Album', ['album_key', 'songs'])

The data structure is a list of Albums.

There are thousands of albums, with 10-20 songs in each.

I'm looking for matches:

for new_album in new_albums:
    for old_album in old_albums:
        if new_album.album_key == old_album.album_key:
            for new_song in new_album.songs:
                for old_song in old_album.songs:
                    if new_song.fullpath == old_song.fullpath:
                        # do something
                        break

This is inefficient, mainly because it restarts the loop through old_albums for each new_album. One solution is to use dictionaries, but I need to sort, and OrderedDict only preserves key insertion order. Another is to convert the list to a dictionary, process it, then convert it back to a list, but that does not seem ideal.

Is there a better way?

You don't have to convert the data into a new format, but you can still use a dict for finding matches:

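paths = {}  # maps fullpath -> (album index, song index) of its first occurrence

# enumerate yields (index, item) pairs, so zip(..., xrange(len(...))) is not needed
for a_id, album in enumerate(albums):
    for s_id, song in enumerate(album.songs):
        if song.fullpath not in paths:
            # first time this path is seen: remember where it lives
            paths[song.fullpath] = (a_id, s_id)
        else:
            # do something
            # note: break skips the rest of this album's songs; drop it to check them all
            break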

When you get to # do something, paths[song.fullpath] gives you a tuple: [0] is the index of the matching album and [1] is the index of the matching song within it. So:

matched_album, matched_song = paths[song.fullpath]
print("{} matches!".format(albums[matched_album].songs[matched_song]))
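
For the two separate lists in the question (new_albums and old_albums), the same dict idea could be sketched roughly as below. This is only an illustration under some assumptions: find_matches is a hypothetical helper name, the sample albums and paths are made up, album_key is assumed to be hashable and unique within old_albums, and collecting the matched pairs stands in for "do something".

```python
from collections import namedtuple

Song = namedtuple('Song', ['fullpath', 'tags'])      # tags is a dictionary
Album = namedtuple('Album', ['album_key', 'songs'])


def find_matches(new_albums, old_albums):
    """Return (new_song, old_song) pairs sharing album_key and fullpath."""
    # Index the old albums once; assumes album_key is hashable and unique.
    old_by_key = {album.album_key: album for album in old_albums}
    matches = []
    for new_album in new_albums:
        old_album = old_by_key.get(new_album.album_key)
        if old_album is None:
            continue  # no old album with this key
        # Index this old album's songs by path (first occurrence wins).
        old_songs = {}
        for song in old_album.songs:
            old_songs.setdefault(song.fullpath, song)
        for new_song in new_album.songs:
            old_song = old_songs.get(new_song.fullpath)
            if old_song is not None:
                matches.append((new_song, old_song))  # "do something" goes here
    return matches


# Made-up sample data, for illustration only.
old_albums = [Album('a1', [Song('/m/a1/01.mp3', {'title': 'One'}),
                           Song('/m/a1/02.mp3', {'title': 'Two'})]),
              Album('a2', [Song('/m/a2/01.mp3', {'title': 'Three'})])]
new_albums = [Album('a1', [Song('/m/a1/02.mp3', {'title': 'Two (remaster)'}),
                           Song('/m/a1/03.mp3', {'title': 'New'})]),
              Album('a3', [Song('/m/a3/01.mp3', {'title': 'Other'})])]

matches = find_matches(new_albums, old_albums)
for new_song, old_song in matches:
    print("{} matches!".format(new_song.fullpath))
```

Both lists are left untouched, so whatever order you sorted them into is preserved; the dicts are only throwaway lookup tables. The work is roughly proportional to the total number of songs rather than to the product of the two list sizes.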

Does this help?

