
Aggregate sets in pandas

I have a table like this:

col1    col2
a       {...}
a       {...}
b       {...}
c       {...}
c       {...}
c       {...}

Here col2 contains sets. I need to group by col1 so that col2 becomes the union of the sets in each group.

My best attempt so far is this:

from functools import reduce

def set_union(*sets):
    return reduce(lambda a, b: a.union(b), sets)

mytable.groupby('col1', as_index=False)['col2'].agg(set_union)

Which yields:

ValueError: Must produce aggregated value

Does anyone have any solution?

Remove the splat (the `*`) from your function signature, so that the function receives each group as a single iterable of sets:

from functools import reduce

def set_union(sets):
    # sets is the group's column of sets, passed by agg as one argument
    return reduce(lambda a, b: a.union(b), sets)

mytable.groupby('col1', as_index=False).agg(set_union)

  col1       col2
0    a     {1, 2}
1    b        {3}
2    c  {4, 5, 6}
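
To see why the splat matters, here is a minimal sketch in plain Python. It assumes a list of sets as a stand-in for the group that pandas passes to the function; the names set_union_splat, group, wrapped and merged are made up for illustration. With `*sets`, the single argument is wrapped in a one-element tuple, and `reduce` over a one-element sequence returns that element untouched, so the function hands back the whole group rather than one set.

```python
from functools import reduce

def set_union_splat(*sets):
    # A single argument arrives as the 1-tuple (group,), so reduce
    # has nothing to combine and returns the group itself.
    return reduce(lambda a, b: a.union(b), sets)

def set_union(sets):
    # Without the splat, reduce iterates over the sets inside the group.
    return reduce(lambda a, b: a.union(b), sets)

# Hypothetical stand-in for one group's column of sets
group = [{1}, {2}]

wrapped = set_union_splat(group)  # the original list, not a set
merged = set_union(group)         # the union of the sets in the group

print(type(wrapped).__name__, merged)
```

In the pandas case the returned object is the group's Series instead of a list, which is not a single aggregated value.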

I like this version better, since it does not need reduce:

def set_union(sets):
    return set().union(*sets)

mytable.groupby('col1', as_index=False).agg(set_union)
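
For anyone who wants to try it, here is a self-contained sketch. The sample data is an assumption, chosen only so that the groups match the output shown earlier; the question does not show the actual contents of the sets.

```python
import pandas as pd

# Assumed sample data: one small set per row
mytable = pd.DataFrame({
    "col1": ["a", "a", "b", "c", "c", "c"],
    "col2": [{1}, {2}, {3}, {4}, {5}, {6}],
})

def set_union(sets):
    # Unpack the group's sets into a single union call on an empty set
    return set().union(*sets)

result = mytable.groupby("col1", as_index=False).agg(set_union)
print(result)
```

Starting from an empty set also means the function never modifies any of the original sets in the table.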
