
Find Min, Max and Average of two fields in dictionary

After processing the data, I have a batch of rows in the following format:

(u'378491520468_sale', {'price': 2100000, 'built': 3815})

(u'378491119.1537520468_sale', {'price': 2100000, 'built': 3815})

(u'1306084076.1535728358_rent', {'price': 1400, 'built': 1109})

(u'1303342766.1548320090_sale', {'price': 550, 'built': 1200})

(u'1890530682.1515660872_sale', {'price': 130000, 'built': 759})

(u'8212134.1548317851_rent', {'price': 2900, 'built': 1220})

(u'1170655463.1513653914_sale', {'price': 430000, 'built': 1142})

(u'58676746.1548308550_sale', {'price': 1700000, 'built': 3000})

(u'1162578480.1474216313_sale', {'price': 10000000, 'built': 3})

(u'1860145003.1546594155_rent', {'price': 4200, 'built': 839})

(u'1640943061.1489124089_sale', {'price': 710000, 'built': 1600})

(u'1008351255.1547539066_rent', {'price': 15000, 'built': 8400})

(u'903442891.1547795833_sale', {'price': 148000, 'built': 786})

where the first element of each tuple is the unique ID.

I know about the basic CombineFn class, which can group (key, value) pairs and compute the min, max and average in a fixed window. But with a dictionary as the value, I need some guidance on computing them in the following format:

("the_unique_id", {
            "price":{
                "min": 0,
                "max": 0,
                "average": 0
            },
            "built": {
                "min": 0,
                "max": 0,
                "average": 0
            }
        }), ...

If you can get the data into the form below, here is a way to calculate the aggregate values:

import pandas as pd

data = {'ID': [u'378491520468_sale', u'378491119.1537520468_sale', u'1306084076.1535728358_rent'],
        'price': [2100000, 2100000, 1400],
        'built': [3815, 3815, 1109]}

df = pd.DataFrame(data)

aggregates = {
    'price': ['min', 'max', 'mean'],
    'built': ['min', 'max', 'mean'],
}

df = df.groupby('ID').agg(aggregates)

res = []

for i in range(len(df)):
    row = df.iloc[i]
    res.append((row.name,
                {'price': {'min': row['price']['min'],
                           'max': row['price']['max'],
                           'average': row['price']['mean']},
                 'built': {'min': row['built']['min'],
                           'max': row['built']['max'],
                           'average': row['built']['mean']}}))

print(res)
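
Since the question mentions CombineFn, the same aggregation can also be sketched without pandas, working directly on the (ID, dict) tuples. The class below is only a plain-Python stand-in: it does not import apache_beam, and the names MinMaxMeanFn, FIELDS and aggregate are made up for illustration. It follows the four-method shape of a Beam CombineFn (create_accumulator, add_input, merge_accumulators, extract_output), so in a real pipeline it would presumably subclass beam.CombineFn and be applied per key, for example with beam.CombinePerKey. The sample rows are hypothetical and repeat a key so that the aggregates are not trivial.

```python
from collections import defaultdict

FIELDS = ("price", "built")  # fields to aggregate, as named in the question


class MinMaxMeanFn:
    """Plain-Python stand-in shaped like a Beam CombineFn (assumption)."""

    def create_accumulator(self):
        # Per field: [min, max, sum, count]; None means "no value seen yet".
        return {f: [None, None, 0, 0] for f in FIELDS}

    def add_input(self, acc, row):
        # Fold one {'price': ..., 'built': ...} dict into the accumulator.
        for f in FIELDS:
            v = row[f]
            lo, hi, total, n = acc[f]
            acc[f] = [v if lo is None else min(lo, v),
                      v if hi is None else max(hi, v),
                      total + v, n + 1]
        return acc

    def merge_accumulators(self, accs):
        # Combine partial accumulators, skipping any that saw no input.
        merged = self.create_accumulator()
        for acc in accs:
            for f in FIELDS:
                lo, hi, total, n = acc[f]
                if n == 0:
                    continue
                mlo, mhi, mtotal, mn = merged[f]
                merged[f] = [lo if mlo is None else min(mlo, lo),
                             hi if mhi is None else max(mhi, hi),
                             mtotal + total, mn + n]
        return merged

    def extract_output(self, acc):
        # Produce the nested {'min', 'max', 'average'} dict per field.
        return {f: {"min": acc[f][0], "max": acc[f][1],
                    "average": acc[f][2] / acc[f][3] if acc[f][3] else None}
                for f in FIELDS}


def aggregate(rows, fn=None):
    # Group rows by key and run the accumulator over each group.
    fn = fn or MinMaxMeanFn()
    accs = defaultdict(fn.create_accumulator)
    for key, value in rows:
        accs[key] = fn.add_input(accs[key], value)
    return [(key, fn.extract_output(acc)) for key, acc in accs.items()]


# Hypothetical rows, not taken from the question's data.
rows = [
    ("a_sale", {"price": 100, "built": 10}),
    ("a_sale", {"price": 300, "built": 30}),
    ("b_rent", {"price": 1400, "built": 1109}),
]
result = aggregate(rows)
print(result)
```

Keeping the sum and count in the accumulator, rather than a running average, is what lets partial results be merged correctly: the average is only computed once, in extract_output.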
