简体   繁体   中英

Python: Counting unique instances of lists/DataFrame in a list of lists/DataFrames

I have a for loop that creates - let's say - 1000 lists. The generation of these lists are slightly randomized, so there is a difference between the generated lists, but there will also be some that overlap. And I want to count how many times a unique list occurs, that is, how many times a given list overlaps with another generated list.

Each item in the list is formatted as follows:

TeamRecord(name='GER', group='F', p=9, gs=6, ga=2, defeated=['SWE', 'MEX', 'KOR']),

If it helps, here's the context: As the list item might indicate, I'm simulating the soccer World Cup group stages and each simulation results in a list containing each teams' performance for that given simulation. So I want to see, given for example 10000 simulations, which outcomes are most likely given how many times they occur in the simulation.

I think this is more of an abstract question, and I don't really have any code to provide that would be useful. I did try to tinker a bit with converting the lists to DataFrames and thought of using the .equals method, but I'm not sure how that could be done effectively.

So again, the question is:

How would you go about counting the occurence of each unique instance of a list generated by a for-loop - that is, all the items in the list should be identicel with another generated list. Is this even possible to do, or is it simply a dumb way of looking at it?

EDIT Simple example illustrating the purpose:

list_of_lists = [['Test1', 'Test2', 'Test3'],
                ['Test1', 'Test2', 'Test3'],
                ['Test4', 'Test5', 'Test6']]

How would you go about counting that there are two instances of the first two lists, 1 of the third list and so on.

Any solution will be specific to the type of object you are counting. I deal only with the specific example you have highlighted, ie a list of lists of strings.

You can use collections.Counter on tuple versions of your sublists. This works because tuples are hashable while lists are not.

from collections import Counter

L = [['Test1', 'Test2', 'Test3'],
     ['Test1', 'Test2', 'Test3'],
     ['Test4', 'Test5', 'Test6']]

res = Counter(map(tuple, L))

print(res)

Counter({('Test1', 'Test2', 'Test3'): 2,
         ('Test4', 'Test5', 'Test6'): 1})

For simple diagnostics of multiple entries:

from collections import Counter

lists = [['Test1', 'Test2', 'Test3'],
         ['Test1', 'Test2', 'Test3'],
         ['Test4', 'Test5', 'Test6']]

def hasher(x):
    return ''.join(x)

hashed = [hasher(x) for x in lists]
cnt = Counter(hashed)
print(cnt)

# you can reverse to to original list 
# if you world to combine it with the count, in some fashion
lists2 = [(cnt[hasher(x)], x) for x in lists] 

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM