Count unique values based on a grouping field

I'm trying to write a function that counts the unique values in a list, grouped by another field. My example data is below: listaa[i][0] is the grouping field, and listaa[i][2] is the value whose distinct occurrences must be counted within each group.

listaa = [(u'2004-2006', 48600.0, 386011),
 (u'2004-2006', 900.0, 385792),
 (u'2004-2006', 16200.0, 385792),
 (u'2004-2006', 11700.0, 385792),
 (u'2004-2006', 900.0, 385792),
 (u'2006-2008', 900.0, 386198),
 (u'2006-2008', 39600.0, 385916),
 (u'2006-2008', 4500.0, 385916),
 (u'2006-2008', 900.0, 385916),
 (u'2006-2008', 900.0, 385916),
 (u'2008-2010', 11700.0, 386067)]

This is my code, and it works. Is there a simpler way to do the same thing?

fechas = list(set([f[0] for f in listaa]))
fechas.sort()
lista1 = []
lista2 = []
for fecha in fechas:
    for l in listaa:
        if l[0] == fecha:
            lista1.append(l[2])
    lista2.append(str(len(set(lista1))))
    lista1 = []
print(lista2)

The expected result is ["2", "2", "1"].

You can use a defaultdict to easily tally unique values per group. (On mobile, sorry for no example output.)

from collections import defaultdict

# Map each grouping key (row[0]) to the set of distinct row[2] values seen for it.
values = defaultdict(set)
for row in listaa:
    values[row[0]].add(row[2])
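
To get from the sets to the list the question asks for, one option is to take the size of each set in sorted key order. The following is only a minimal sketch using the sample data from the question; the function name count_unique_per_group and its key_index/value_index parameters are made up for illustration and do not come from the original answer.

```python
from collections import defaultdict


def count_unique_per_group(rows, key_index=0, value_index=2):
    """Return the number of distinct values per group, ordered by group key.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    groups = defaultdict(set)
    for row in rows:
        # A set keeps each value once, so duplicates within a group collapse.
        groups[row[key_index]].add(row[value_index])
    # Sorting the keys mirrors the fechas.sort() step in the question.
    return [str(len(groups[key])) for key in sorted(groups)]


listaa = [(u'2004-2006', 48600.0, 386011),
          (u'2004-2006', 900.0, 385792),
          (u'2004-2006', 16200.0, 385792),
          (u'2004-2006', 11700.0, 385792),
          (u'2004-2006', 900.0, 385792),
          (u'2006-2008', 900.0, 386198),
          (u'2006-2008', 39600.0, 385916),
          (u'2006-2008', 4500.0, 385916),
          (u'2006-2008', 900.0, 385916),
          (u'2006-2008', 900.0, 385916),
          (u'2008-2010', 11700.0, 386067)]

print(count_unique_per_group(listaa))  # ['2', '2', '1']
```

This makes a single pass over the rows, whereas the nested loops in the question rescan the whole list once per group.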

Here is a pandas solution that uses nunique():

import pandas as pd

listaa = [(u'2004-2006', 48600.0, 386011),
 (u'2004-2006', 900.0, 385792),
 (u'2004-2006', 16200.0, 385792),
 (u'2004-2006', 11700.0, 385792),
 (u'2004-2006', 900.0, 385792),
 (u'2006-2008', 900.0, 386198),
 (u'2006-2008', 39600.0, 385916),
 (u'2006-2008', 4500.0, 385916),
 (u'2006-2008', 900.0, 385916),
 (u'2006-2008', 900.0, 385916),
 (u'2008-2010', 11700.0, 386067)]

df = pd.DataFrame(listaa, columns=['Date','Val1','Val2'])

df.groupby('Date')['Val2'].nunique().tolist()

Gives (as integers rather than strings, with groups in sorted key order):

[2, 2, 1]
