
Count occurrence in sub-list

I'm new to Python. Can you please help me implement this program in Python?

I have a list. The input:

list=[["word1", "word2"],["word1", "word2","word1"], ["word4", "word5","word4", "word5", "word2", "word3"]]

The output:

out=[
   {"word1", "word2",2}, {"word2", "word1",1},{"word4", "word5",2},{"word5", "word4",1},
   {"word5", "word2",1},{"word2", "word3",1}
]

How it works: it counts each pair of successive words within a sub-list, and it increments that count whenever the same pair occurs again, in the same or in another sub-list.

If I understand you right, you want to count ordered pairs of consecutive words across all sub-lists. One way to do so is with a collections.Counter and a generator expression.

>>> import collections
>>> lst = [["word1", "word2"],
...        ["word1", "word2", "word1"],
...        ["word4", "word5", "word4", "word5", "word2", "word3"]]
...
>>> collections.Counter((a, b) for l in lst for a, b in zip(l, l[1:]))
Counter({('word1', 'word2'): 2,
         ('word2', 'word1'): 1,
         ('word4', 'word5'): 2,
         ('word5', 'word4'): 1,
         ('word5', 'word2'): 1,
         ('word2', 'word3'): 1})
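
For readers new to the idiom: zip(l, l[1:]) pairs each word with the word that follows it. The sketch below spells out the same count as an explicit loop. The function name count_successive_pairs is only an illustrative choice and is not part of the answer above.

```python
from collections import Counter

def count_successive_pairs(sublists):
    """Count ordered pairs of adjacent words across all sub-lists."""
    counts = Counter()
    for words in sublists:
        # zip stops at the shorter argument, so the last word starts no pair
        for first, second in zip(words, words[1:]):
            counts[(first, second)] += 1
    return counts

lst = [["word1", "word2"],
       ["word1", "word2", "word1"],
       ["word4", "word5", "word4", "word5", "word2", "word3"]]

pair_counts = count_successive_pairs(lst)
print(pair_counts[("word1", "word2")])  # 2
print(pair_counts[("word2", "word1")])  # 1
```

Pairs never span two sub-lists, because each sub-list is zipped only with itself.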

The format of the result is a bit different from the one in your example, but also a lot more useful. If you actually want a list of sets instead, you can use [{a, b, c} for (a, b), c in _.items()] (in the interactive interpreter, _ holds the last result). Note that sets are unordered, so the elements of each set may come out in a different order than in your example.
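
If the order of the two words matters, a list of tuples may be a safer target than a list of sets. This is a minimal sketch of that alternative, not the format the question asked for. The name out is borrowed from the question, and counts is an illustrative name.

```python
from collections import Counter

lst = [["word1", "word2"],
       ["word1", "word2", "word1"],
       ["word4", "word5", "word4", "word5", "word2", "word3"]]

counts = Counter((a, b) for l in lst for a, b in zip(l, l[1:]))

# Tuples keep (first word, second word, count) in a fixed order,
# and a pair such as ("x", "x") does not collapse as it would in a set.
out = [(a, b, n) for (a, b), n in counts.items()]

for triple in out:
    print(triple)
```

On Python 3.7 and later, a Counter iterates in insertion order, so the triples appear in the order in which each pair was first seen.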
