
Finding count of tuples with same first and third item in list of tuples

I have a list of tuples, each with three items:

z = [(1, 4, 2015), (1, 11, 2015), (1, 18, 2015), (1, 25, 2015), (2, 1, 2015), (2, 8, 2015), (2, 15, 2015), (2, 22, 2015), (3, 1, 2015), (3, 8, 2015), (3, 15, 2015), (3, 22, 2015), (3, 29, 2015), (4, 5, 2015), (4, 12, 2015), (4, 19, 2015), (4, 26, 2015), (5, 3, 2015), (5, 10, 2015), (5, 17, 2015), (5, 24, 2015), (5, 31, 2015), (6, 7, 2015), (6, 14, 2015), (6, 21, 2015), (6, 28, 2015), (7, 5, 2015), (7, 12, 2015), (7, 19, 2015), (7, 26, 2015), (8, 2, 2015), (8, 9, 2015), (8, 16, 2015), (8, 23, 2015), (8, 30, 2015), (9, 6, 2015), (9, 13, 2015), (9, 20, 2015), (9, 27, 2015), (10, 4, 2015), (10, 11, 2015), (10, 18, 2015), (10, 25, 2015), (11, 1, 2015), (11, 8, 2015), (11, 15, 2015), (11, 22, 2015), (11, 29, 2015), (12, 6, 2015), (12, 13, 2015), (12, 20, 2015), (12, 27, 2015), (1, 3, 2016), (1, 10, 2016), (1, 17, 2016), (1, 24, 2016), (1, 31, 2016)]

I want to count the tuples in the list that share the same first and third items. For example, there are 4 tuples with first item 1 and third item 2015, and 4 tuples with first item 2 and third item 2015.

I tried:

for tup in z:
    a=tup[0]
    b=tup[2]
    print(len(set({a:b})))

This doesn't give the desired result. How can I do it?
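For reference, the attempt above prints 1 on every iteration: `{a: b}` is a one-entry dict, and `set()` over a dict collects only its keys, so the set always has exactly one element. A minimal sketch with a single tuple from the list:

```python
# One tuple taken from the question's list
tup = (1, 4, 2015)
a = tup[0]
b = tup[2]

pair = {a: b}     # a dict with one entry: {1: 2015}
keys = set(pair)  # iterating a dict yields its keys, so this is {1}
print(len(keys))  # 1, regardless of which tuple is used
```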

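Using itertools.groupby from the standard library (this relies on z already being ordered so that tuples with equal first and third items are adjacent):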

from itertools import groupby

for grp, elmts in groupby(z, lambda x: (x[0], x[2])):
    print(grp, len(list(elmts)))

Edit:

An even nicer solution uses operator.itemgetter instead of a lambda:

from operator import itemgetter
from itertools import groupby

for grp, elmts in groupby(z, itemgetter(0, 2)):
    print(grp, len(list(elmts)))

Output:

(1, 2015) 4
(2, 2015) 4
(3, 2015) 5
(4, 2015) 4
(5, 2015) 5
(6, 2015) 4
(7, 2015) 4
(8, 2015) 5
(9, 2015) 4
(10, 2015) 4
(11, 2015) 5
(12, 2015) 4
(1, 2016) 5
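Note that groupby only merges consecutive items with equal keys, which works here because z is already ordered by year and month. If the input order is not guaranteed, one option is to sort by the same key first. A small sketch, using a hypothetical shuffled sample rather than the full list:

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical unordered sample (not the question's full data)
sample = [(2, 1, 2015), (1, 4, 2015), (2, 8, 2015), (1, 3, 2016), (1, 11, 2015)]

key = itemgetter(0, 2)

# Without sorting, equal keys that are not adjacent form separate groups
unsorted_counts = [(k, len(list(g))) for k, g in groupby(sample, key)]

# Sorting by the same key first makes equal keys adjacent
sorted_counts = [(k, len(list(g))) for k, g in groupby(sorted(sample, key=key), key)]

print(unsorted_counts)
print(sorted_counts)
```

In the unsorted case every tuple ends up in its own group, so the counts are all 1; after sorting, the pairs are counted as expected.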

Using collections.Counter with operator.itemgetter:

from collections import Counter
from operator import itemgetter

res = Counter(map(itemgetter(0, 2), z))

print(res)

Counter({(1, 2015): 4,
         (1, 2016): 5,
         (2, 2015): 4,
         (3, 2015): 5,
         (4, 2015): 4,
         (5, 2015): 5,
         (6, 2015): 4,
         (7, 2015): 4,
         (8, 2015): 5,
         (9, 2015): 4,
         (10, 2015): 4,
         (11, 2015): 5,
         (12, 2015): 4})
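Since a Counter is a dict subclass, individual counts can be read back by key, and most_common returns the largest groups first. A brief sketch on a shortened subset of the question's list (the name `subset` is only for illustration):

```python
from collections import Counter
from operator import itemgetter

# Shortened subset of the data, for illustration only
subset = [(1, 4, 2015), (1, 11, 2015), (3, 1, 2015), (3, 8, 2015), (3, 15, 2015), (1, 3, 2016)]

res = Counter(map(itemgetter(0, 2), subset))

print(res[(1, 2015)])      # count for first item 1, third item 2015
print(res[(7, 2015)])      # a missing key returns 0 instead of raising KeyError
print(res.most_common(1))  # the single most frequent (first, third) pair
```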

In pure Python, use Counter with a generator expression (thanks @Felix):

from collections import Counter

out = Counter((x[0], x[2]) for x in z)
print (out)
Counter({(3, 2015): 5, 
         (5, 2015): 5, 
         (8, 2015): 5,
         (11, 2015): 5, 
         (1, 2016): 5, 
         (1, 2015): 4, 
         (2, 2015): 4, 
         (4, 2015): 4, 
         (6, 2015): 4, 
         (7, 2015): 4, 
         (9, 2015): 4, 
         (10, 2015): 4,
         (12, 2015): 4})

In pandas, aggregate the counts with GroupBy.size; the output is a Series:

import pandas as pd

s = pd.DataFrame(z).groupby([0, 2]).size()
print(s)
0   2   
1   2015    4
    2016    5
2   2015    4
3   2015    5
4   2015    4
5   2015    5
6   2015    4
7   2015    4
8   2015    5
9   2015    4
10  2015    4
11  2015    5
12  2015    4
dtype: int64

Using collections.defaultdict.

Example:

import collections
d = collections.defaultdict(int)
z = [(1, 4, 2015), (1, 11, 2015), (1, 18, 2015), (1, 25, 2015), (2, 1, 2015), (2, 8, 2015), (2, 15, 2015), (2, 22, 2015), (3, 1, 2015), (3, 8, 2015), (3, 15, 2015), (3, 22, 2015), (3, 29, 2015), (4, 5, 2015), (4, 12, 2015), (4, 19, 2015), (4, 26, 2015), (5, 3, 2015), (5, 10, 2015), (5, 17, 2015), (5, 24, 2015), (5, 31, 2015), (6, 7, 2015), (6, 14, 2015), (6, 21, 2015), (6, 28, 2015), (7, 5, 2015), (7, 12, 2015), (7, 19, 2015), (7, 26, 2015), (8, 2, 2015), (8, 9, 2015), (8, 16, 2015), (8, 23, 2015), (8, 30, 2015), (9, 6, 2015), (9, 13, 2015), (9, 20, 2015), (9, 27, 2015), (10, 4, 2015), (10, 11, 2015), (10, 18, 2015), (10, 25, 2015), (11, 1, 2015), (11, 8, 2015), (11, 15, 2015), (11, 22, 2015), (11, 29, 2015), (12, 6, 2015), (12, 13, 2015), (12, 20, 2015), (12, 27, 2015), (1, 3, 2016), (1, 10, 2016), (1, 17, 2016), (1, 24, 2016), (1, 31, 2016)]
for i in z:
    d[(i[0], i[2])] += 1
print(d)

Output (shown as printed by Python 2; Python 3 displays <class 'int'> instead of <type 'int'>):

defaultdict(<type 'int'>, {(10, 2015): 4, (5, 2015): 5, (2, 2015): 4, (11, 2015): 5, (6, 2015): 4, (8, 2015): 5, (3, 2015): 5, (12, 2015): 4, (7, 2015): 4, (9, 2015): 4, (4, 2015): 4, (1, 2016): 5, (1, 2015): 4})

You can store the counts in a dict, keyed by a tuple consisting of the first and third items of each original tuple, e.g.:

import collections

z = [(1, 4, 2015), (1, 11, 2015), (1, 18, 2015), (1, 25, 2015), (2, 1, 2015), (2, 8, 2015),
     (2, 15, 2015), (2, 22, 2015), (3, 1, 2015), (3, 8, 2015), (3, 15, 2015), (3, 22, 2015),
     (3, 29, 2015), (4, 5, 2015), (4, 12, 2015), (4, 19, 2015), (4, 26, 2015), (5, 3, 2015),
     (5, 10, 2015), (5, 17, 2015), (5, 24, 2015), (5, 31, 2015), (6, 7, 2015), (6, 14, 2015),
     (6, 21, 2015), (6, 28, 2015), (7, 5, 2015), (7, 12, 2015), (7, 19, 2015), (7, 26, 2015),
     (8, 2, 2015), (8, 9, 2015), (8, 16, 2015), (8, 23, 2015), (8, 30, 2015), (9, 6, 2015),
     (9, 13, 2015), (9, 20, 2015), (9, 27, 2015), (10, 4, 2015), (10, 11, 2015),
     (10, 18, 2015), (10, 25, 2015), (11, 1, 2015), (11, 8, 2015), (11, 15, 2015),
     (11, 22, 2015), (11, 29, 2015), (12, 6, 2015), (12, 13, 2015), (12, 20, 2015),
     (12, 27, 2015), (1, 3, 2016), (1, 10, 2016), (1, 17, 2016), (1, 24, 2016), (1, 31, 2016)]

counter = collections.defaultdict(int)  # missing keys start at 0, so no existence check is needed
for element in z:  # iterate over the tuples
    counter[(element[0], element[2])] += 1  # increase the count for each match

# finally, let's print the results
for k, count in counter.items():
    print("{}: {}".format(k, count))

Which will give you:

(1, 2015): 4
(2, 2015): 4
(3, 2015): 5
(4, 2015): 4
(5, 2015): 5
(6, 2015): 4
(7, 2015): 4
(8, 2015): 5
(9, 2015): 4
(10, 2015): 4
(11, 2015): 5
(12, 2015): 4
(1, 2016): 5
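The order above follows insertion order, which dicts preserve in Python 3.7 and later. If the data were not already ordered, the results could be sorted explicitly, for example by the third item (year) and then by the first item. A sketch on a small, hypothetical dict of counts:

```python
# Hypothetical counts in arbitrary order
counter = {(2, 2015): 4, (1, 2016): 5, (1, 2015): 4}

# Sort by third item (year) first, then by first item
ordered = sorted(counter.items(), key=lambda kv: (kv[0][1], kv[0][0]))

for k, count in ordered:
    print("{}: {}".format(k, count))
```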

Try this:

z = [(1, 4, 2015), (1, 11, 2015), (1, 18, 2015), (1, 25, 2015), (2, 1, 2015), (2, 8, 2015), (2, 15, 2015), (2, 22, 2015), (3, 1, 2015), (3, 8, 2015), (3, 15, 2015), (3, 22, 2015), (3, 29, 2015), (4, 5, 2015), (4, 12, 2015), (4, 19, 2015), (4, 26, 2015), (5, 3, 2015), (5, 10, 2015), (5, 17, 2015), (5, 24, 2015), (5, 31, 2015), (6, 7, 2015), (6, 14, 2015), (6, 21, 2015), (6, 28, 2015), (7, 5, 2015), (7, 12, 2015), (7, 19, 2015), (7, 26, 2015), (8, 2, 2015), (8, 9, 2015), (8, 16, 2015), (8, 23, 2015), (8, 30, 2015), (9, 6, 2015), (9, 13, 2015), (9, 20, 2015), (9, 27, 2015), (10, 4, 2015), (10, 11, 2015), (10, 18, 2015), (10, 25, 2015), (11, 1, 2015), (11, 8, 2015), (11, 15, 2015), (11, 22, 2015), (11, 29, 2015), (12, 6, 2015), (12, 13, 2015), (12, 20, 2015), (12, 27, 2015), (1, 3, 2016), (1, 10, 2016), (1, 17, 2016), (1, 24, 2016), (1, 31, 2016)]
newz = [(i[0], i[-1]) for i in z]
for i in set(newz):
    print(str(i) + ' ' + str(newz.count(i)))

Output:

(10, 2015) 4
(5, 2015) 5
(2, 2015) 4
(11, 2015) 5
(6, 2015) 4
(8, 2015) 5
(3, 2015) 5
(12, 2015) 4
(7, 2015) 4
(9, 2015) 4
(1, 2016) 5
(4, 2015) 4
(1, 2015) 4
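Because newz.count(i) rescans the whole list for every distinct pair, this approach does more work than a single counting pass as the number of distinct pairs grows. Both give the same numbers, as this sketch on a shortened sample suggests (the names `by_count` and `by_counter` are only for illustration):

```python
from collections import Counter

# Shortened sample of the data, for illustration only
sample = [(1, 4, 2015), (1, 11, 2015), (2, 1, 2015), (1, 3, 2016)]
newz = [(i[0], i[-1]) for i in sample]

# One full scan of newz per distinct pair
by_count = {pair: newz.count(pair) for pair in set(newz)}

# One pass over newz in total
by_counter = Counter(newz)

print(by_count == dict(by_counter))
```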

A solution other than groupby:

import pprint
import random

from collections import Counter

z = []  # random sample dates; the year range is hard-coded below and must be widened for other years

num_dates = 20
counts_by_month_and_year = Counter()

while len(z) < num_dates:
    # randrange excludes its upper bound, so the values are 1..30, 1..11 and always 2015
    new = (random.randrange(1, 31), random.randrange(1, 12), random.randrange(2015, 2016))

    z.append(new)
    counts_by_month_and_year[(new[0], new[2])] += 1


pprint.pprint(dict(counts_by_month_and_year)) # formatting the output 
{(1, 2015): 1,
 (3, 2015): 1,
 (4, 2015): 1,
 (5, 2015): 1,
 (7, 2015): 1,
 (8, 2015): 2,
 (9, 2015): 1,
 (11, 2015): 1,
 (13, 2015): 1,
 (16, 2015): 1,
 (17, 2015): 1,
 (20, 2015): 1,
 (21, 2015): 2,
 (22, 2015): 1,
 (25, 2015): 1,
 (26, 2015): 1,
 (27, 2015): 2}

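Note that randrange excludes its upper bound, so the code above never produces 31, 12, or 2016, and it puts the wider range in the first position, unlike the (month, day, year) layout in the question. A variant, assuming that layout and using randint (which includes both ends), might look like this; the fixed seed and the day cap of 28 are assumptions made only to keep the run repeatable and every generated date valid:

```python
import random
from collections import Counter

random.seed(0)  # assumed seed, only to make the sketch repeatable

num_dates = 20
z = []
counts_by_month_and_year = Counter()

while len(z) < num_dates:
    # (month, day, year); day capped at 28 so it is valid in every month
    new = (random.randint(1, 12), random.randint(1, 28), random.randint(2015, 2016))
    z.append(new)
    counts_by_month_and_year[(new[0], new[2])] += 1

# The counts always add up to the number of generated tuples
print(sum(counts_by_month_and_year.values()))
```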

from collections import Counter

tmp = [(x[0], x[2]) for x in z]
print(Counter(tmp))

The output will look like:

Counter({(5, 2015): 5, (11, 2015): 5, (8, 2015): 5, (3, 2015): 5, (1, 2016): 5, (10, 2015): 4, (2, 2015): 4, (6, 2015): 4, (12, 2015): 4, (7, 2015): 4, (9, 2015): 4, (4, 2015): 4, (1, 2015): 4})
