
Cannot call a function using the .agg method in pandas?

I am trying to finish a pandas course using Python on DataCamp and ran into an issue. I have the solution, but I want to understand it. The quiz is simple: apply several numpy functions to grouped data.

This is their suggested tip for completing this small quiz:

.agg() can take in a list of functions. The functions shouldn't be called, so don't use parentheses with them.

This was my code to find the min, max, mean, and median of weekly_sales for each store type:

sales_stats = sales.groupby("type")["weekly_sales"].agg([np.min(), np.max(), np.mean(), np.median()])

and this is the error:

File "<stdin>", line 4, in mean
TypeError: _mean_dispatcher() missing 1 required positional argument: 'a'

so I changed it to:

sales_stats = sales.groupby("type")["weekly_sales"].agg([np.mean(sales["weekly_sales"]),np.median,np.min,np.max])

but another error occurred, so I looked at the solution:

sales_stats = sales.groupby("type")["weekly_sales"].agg([np.min, np.max, np.mean, np.median])

Does that mean we don't have to pass any arguments to these numpy functions, and that .agg will pass "weekly_sales" as the argument to each of them? If so, and I want to pass two columns to these functions, for example monthly_sales as well, is this the right way?

sales_stats = sales.groupby("type")["weekly_sales","monthly_sales"].agg([np.min, np.max, np.mean, np.median])

You're very close, but the correct syntax would be:

sales_stats = (
    sales.groupby("type")[["weekly_sales","monthly_sales"]]
    .agg([np.min, np.max, np.mean, np.median])
)

This is because selecting multiple columns from a DataFrame, or in this case a GroupBy object, requires a list of column names, hence the double brackets. Without the inner list, Python treats the two names as a tuple, which recent pandas versions reject. This snippet will return the minimum, maximum, mean, and median of both the "weekly_sales" and "monthly_sales" columns, grouped by "type".
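To see the shape of the result, here is a minimal sketch on made-up data. The values and the "monthly_sales" column are assumptions for illustration, not the course's dataset. It also swaps the numpy functions for the equivalent string names ("min", "max", "mean", "median"), which pandas accepts too and which keep the column labels the same across pandas and numpy versions.

```python
import pandas as pd

# Hypothetical toy data; the real `sales` table from the course is not shown here
sales = pd.DataFrame({
    "type": ["A", "A", "B", "B"],
    "weekly_sales": [10.0, 20.0, 30.0, 50.0],
    "monthly_sales": [40.0, 80.0, 120.0, 200.0],
})

# Double brackets: the inner list selects two columns from the GroupBy object
sales_stats = (
    sales.groupby("type")[["weekly_sales", "monthly_sales"]]
    .agg(["min", "max", "mean", "median"])
)

# The result has one row per store type and two-level column labels:
# (column name, statistic name)
print(sales_stats)
print(sales_stats.loc["A", ("weekly_sales", "mean")])
```

Each original column gets its own block of four statistics, so two columns and four functions give eight result columns.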

Does that mean we don't have to pass any arguments to these numpy functions, and that .agg will pass "weekly_sales" as the argument to each of them? If so, and I want to pass two columns to these functions, for example monthly_sales as well, is this the right way?

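Yes. pandas passes the arguments under the hood: for each group, it calls every aggregating function with that group's values (the "weekly_sales" sub-array for one store type). That is why you hand .agg the function objects themselves instead of calling them.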
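A small sketch may make this concrete. The data and the `spread` function below are hypothetical, made up for illustration. The first part shows why writing `np.mean()` fails before .agg is even reached, and the second part records what pandas actually hands to a function passed without parentheses.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the course's `sales` table
sales = pd.DataFrame({
    "type": ["A", "A", "B", "B"],
    "weekly_sales": [10.0, 20.0, 30.0, 50.0],
})

# Calling the function yourself runs it immediately, with no data to work on
try:
    np.mean()
    called_without_data = "no error"
except TypeError as exc:
    called_without_data = f"TypeError: {exc}"
print(called_without_data)

got_series = []

def spread(values):
    # Hypothetical aggregator: note what pandas passes in, then reduce it
    got_series.append(isinstance(values, pd.Series))
    return values.max() - values.min()

# Pass the function object (no parentheses); pandas supplies each group's values
result = sales.groupby("type")["weekly_sales"].agg(spread)
print(result)
```

The same mechanism applies to `np.min`, `np.max`, `np.mean`, and `np.median`: .agg only needs something it can call later with each group's data.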

If you want some more fine-grained control, you can pass a dictionary like so:

sales_stats = (
    sales.groupby("type")
    .agg({
        "weekly_sales": np.mean, 
        "monthly_sales": [np.min, np.max]
    })
)

This will return the mean of "weekly_sales" as well as the min & max of "monthly_sales". Check out some of the examples from the [


 