pandas custom aggregation function

Question

I have a pandas DataFrame on which the following command works:

house.groupby(['place_name'])['index_nsa'].agg(['first','last'])

It gives me what I want. Now I want to make a custom aggregation value that gives me the percentage change between the first and the last value.

I got an error for doing math on the values, so I assumed that I have to turn them into numbers.

house.groupby(['place_name'])['index_nsa'].agg({"change in %":[(int('last')-int('first')/int('first')]})

Unfortunately, I only get a syntax error on the last bracket, and I cannot find the cause.

Does anyone see where I went wrong?

Answer 1

You will need to define and pass a callback to agg here. You can do that in-line with a lambda function:

house.groupby(['place_name'])['index_nsa'].agg([
    ("change in %", lambda x: (x.iloc[-1] - x.iloc[0]) / x.iloc[0])])

Look closely at the .agg call: to rename the output column, you must pass a list of tuples of the form [(new_name, agg_func), ...]. More info here.
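
To see the list-of-tuples form in action, here is a minimal sketch on made-up data. The `house` frame below is a hypothetical stand-in for the asker's data (only the column names come from the question). Note that the lambda returns a fraction, so multiply by 100 if you want an actual percentage, and that "first" and "last" follow the row order within each group.

```python
import pandas as pd

# Hypothetical toy data; the real `house` frame is not shown in the question.
house = pd.DataFrame({
    "place_name": ["A", "A", "A", "B", "B"],
    "index_nsa": [100.0, 110.0, 120.0, 200.0, 150.0],
})

# Each tuple is (output column name, aggregation function).
result = house.groupby(["place_name"])["index_nsa"].agg([
    ("change in %", lambda x: (x.iloc[-1] - x.iloc[0]) / x.iloc[0]),
])

print(result)
```

For group A this computes (120 - 100) / 100, and for group B (150 - 200) / 200.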

If you want to avoid the lambda at the cost of some verbosity, you may use

def first_last_pct(ser):
    first, last = ser.iloc[0], ser.iloc[-1]
    return (last - first) / first

house.groupby(['place_name'])['index_nsa'].agg([("change in %", first_last_pct)])
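
As an alternative sketch, newer pandas versions (0.25 and later, as far as I know) also accept named aggregation through keyword arguments. Because "change in %" is not a valid Python identifier, the name is passed by unpacking a dict. The toy `house` frame is again hypothetical, and this variant multiplies by 100 so the result reads as a percentage, which is my own choice rather than part of the answer above.

```python
import pandas as pd

def first_last_pct(ser):
    # Percentage change from the first to the last value in the group.
    first, last = ser.iloc[0], ser.iloc[-1]
    return (last - first) / first * 100

# Hypothetical toy data; the real `house` frame is not shown in the question.
house = pd.DataFrame({
    "place_name": ["A", "A", "A", "B", "B"],
    "index_nsa": [100.0, 110.0, 120.0, 200.0, 150.0],
})

# Keyword name becomes the output column; dict unpacking allows spaces in it.
named = house.groupby("place_name")["index_nsa"].agg(
    **{"change in %": first_last_pct}
)

print(named)
```

If the rows are not already in chronological order, sort them before grouping so that iloc[0] and iloc[-1] really are the earliest and latest values.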
