
Group and compute the average in list of tuples

I have a list of tuples like this:

x=[('HSBC8999', 4, 179447), ('HSBC1199', 81, 864108), ('HSBC1199', 32, 715121),('HSBC8999', 4, 1447),('HSBC1199', 32, 61521) ]

I want to perform a few tasks:

  1. Group the list by the 1st item (HSBCXXXX).

  2. Within each group, compute the average of the 3rd item across the tuples that share the same 2nd item.

Something like this:

Group 1:

('HSBC8999', 4, 179447)
('HSBC8999', 4, 1447)

Average for 4: (179447+1447)/2

Group 2:

('HSBC1199', 81, 864108)
('HSBC1199', 32, 715121)
('HSBC1199', 32, 61521)

Average for 81: 864108

Average for 32: (715121+61521)/2

import itertools
import operator

L = [('HSBC8999', 4, 179447), ('HSBC1199', 81, 864108), ('HSBC1199', 32, 715121),
     ('HSBC8999', 4, 1447), ('HSBC1199', 32, 61521)]

# groupby only merges adjacent items, so sort by the grouping key first
L.sort(key=operator.itemgetter(0))
for _name, group in itertools.groupby(L, operator.itemgetter(0)):
    subl = list(group)
    # within each name, sort and group again by the 2nd item
    subl.sort(key=operator.itemgetter(1))
    for k, subg in itertools.groupby(subl, operator.itemgetter(1)):
        subs = list(subg)
        print("the average of {} is {}".format(k, sum(s[2] for s in subs)/len(subs)))
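If the averages are needed as data rather than printed, the two nested groupby passes can probably be collapsed into one by sorting and grouping on the (1st item, 2nd item) pair. The following is only a sketch of that variant; the function name `grouped_averages` and the dictionary layout are assumptions made for illustration, not part of the answer above.

```python
import itertools
import operator


def grouped_averages(rows):
    """Return {(name, code): average of the 3rd item} for a list of 3-tuples.

    Hypothetical helper: the name and return shape are illustrative choices.
    """
    key = operator.itemgetter(0, 1)  # group on the (1st, 2nd) pair
    result = {}
    # sorted() leaves the caller's list untouched; groupby needs sorted input
    for pair, group in itertools.groupby(sorted(rows, key=key), key):
        values = [row[2] for row in group]
        result[pair] = sum(values) / len(values)
    return result


L = [('HSBC8999', 4, 179447), ('HSBC1199', 81, 864108), ('HSBC1199', 32, 715121),
     ('HSBC8999', 4, 1447), ('HSBC1199', 32, 61521)]

averages = grouped_averages(L)
for (name, code), avg in averages.items():
    print("{} / {}: {}".format(name, code, avg))
```

Keying the result by the pair keeps it flat; a nested dict per name would work equally well if per-name lookups are more convenient.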

Using a nested defaultdict with float as the innermost default:

from collections import defaultdict

l = [('A1', 'A', 342.5), ('A2', 'A', 509.70), ('A2', 'B', 119.34),
     ('A1', 'B', 618.42), ('A1', 'A', 173.54), ('A1', 'B', 235.21)]

d = defaultdict(lambda: defaultdict(lambda: defaultdict(float)))

for a,b,c in l:
    d[a][b]['sum'] += c
    d[a][b]['count'] += 1
    d[a][b]['average'] += (c - d[a][b]['average'])/d[a][b]['count']

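We use the fact that the average can be updated incrementally as each new value arrives (see: https://math.stackexchange.com/posts/957376/ ):

    avg_n = avg_(n-1) + (x_n - avg_(n-1)) / n

where x_n is the n-th value and avg_(n-1) is the average of the first n-1 values. This is the update applied to d[a][b]['average'] in the loop.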
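As a quick sanity check, the incremental update used for 'average' in the loop above can be compared against the plain sum-divided-by-count mean. This is a minimal sketch; `running_mean` is a hypothetical helper name introduced only for this check.

```python
def running_mean(values):
    """Mean computed one value at a time, without storing a running sum."""
    mean = 0.0
    for n, x in enumerate(values, start=1):
        # same update as d[a][b]['average'] += (c - average) / count
        mean += (x - mean) / n
    return mean


values = [342.5, 173.54]  # the ('A1', 'A') values from the list above
incremental = running_mean(values)
direct = sum(values) / len(values)
print(incremental, direct)
```

The two results should agree up to floating-point rounding, which is why the loop can maintain 'average' without recomputing sum/count each time.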

After the loop, d holds the following structure (shown as JSON):

{
  "A1": {
    "A": {
      "sum": 516.04,
      "count": 2.0,
      "average": 258.02
    },
    "B": {
      "sum": 853.63,
      "count": 2.0,
      "average": 426.815
    }
  },
  "A2": {
    "A": {
      "sum": 509.7,
      "count": 1.0,
      "average": 509.7
    },
    "B": {
      "sum": 119.34,
      "count": 1.0,
      "average": 119.34
    }
  }
}
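The same approach can be applied to the list from the question, and the result printed in the shape shown above. The sketch below assumes `json.dumps` is an acceptable way to display it; note that it writes the integer keys (4, 81, 32) as strings, and that 'count' prints as a float because the innermost default is float.

```python
import json
from collections import defaultdict

x = [('HSBC8999', 4, 179447), ('HSBC1199', 81, 864108), ('HSBC1199', 32, 715121),
     ('HSBC8999', 4, 1447), ('HSBC1199', 32, 61521)]

d = defaultdict(lambda: defaultdict(lambda: defaultdict(float)))

for name, code, value in x:
    stats = d[name][code]  # created on first access with all fields at 0.0
    stats['sum'] += value
    stats['count'] += 1
    stats['average'] += (value - stats['average']) / stats['count']

# defaultdict is a dict subclass, so json.dumps can serialize it directly
output = json.dumps(d, indent=2)
print(output)
```

Individual averages can then be read directly, for example `d['HSBC1199'][32]['average']`.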

