from pandas import DataFrame, Series
import numpy

def avg_bronze_medal():
    countries = ['Russian Fed', 'Norway', 'Canada']
    gold = [13, 11, 10]
    silver = [11, 5, 10]
    bronze = [9, 10, 5]
    medal_counts = {'country_name': Series(countries),
                    'gold': Series(gold),
                    'silver': Series(silver),
                    'bronze': Series(bronze)}
    df = DataFrame(medal_counts)
    print df
    print df['gold'].apply(numpy.mean, axis=1)
The last line raises "IndexError: tuple index out of range". I need to use the apply function on the data frame to get the average of the gold, silver and bronze columns. In the example above I used only the gold column. Please help me fix the error.
To get the mean of all three columns at the same time:
df[['gold', 'bronze', 'silver']].mean(axis=1)
But it confuses me as to why you would need the average medals awarded in the tournament... But I guess you need it for some reason!
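Since the question asks for apply specifically, here is a minimal sketch that computes the row-wise mean both ways on the question's data. The variable names (medals, row_means, row_means_apply) are only illustrative, and the DataFrame is built from plain lists rather than Series for brevity.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'country_name': ['Russian Fed', 'Norway', 'Canada'],
    'gold': [13, 11, 10],
    'silver': [11, 5, 10],
    'bronze': [9, 10, 5],
})

# Select only the numeric medal columns (a DataFrame, not a Series).
medals = df[['gold', 'silver', 'bronze']]

# Vectorized row-wise mean: one value per country.
row_means = medals.mean(axis=1)

# The same calculation through apply; axis=1 is valid here because
# medals is a DataFrame with more than one column.
row_means_apply = medals.apply(np.mean, axis=1)

print(row_means_apply)
```

Both calls should give the same numbers; the direct .mean(axis=1) is usually the simpler choice.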
Some additional notes the OP should be aware of:
.apply is a method that works on rows or columns (columns by default). If you call df.apply(func), the function func will be applied to all columns, one column at a time. df.apply(func, axis=1) will apply func to all rows, one at a time. A pd.Series has only one column, so its .apply always works element by element and has no axis argument of its own; an extra keyword such as axis=1 is passed on to func, which is most likely why numpy.mean fails in the question.
.apply is useful if you have a complex custom function that you need to apply to either rows or columns. Some statistical measures, such as sum, mean and standard deviation, are common and have vectorized methods of their own, so you can call them directly, as in the answer above.
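To illustrate the three cases, here is a small sketch using the medal counts from the question. The function spread is a hypothetical custom function made up for this example; it is not part of pandas.

```python
import pandas as pd

df = pd.DataFrame({
    'gold': [13, 11, 10],
    'silver': [11, 5, 10],
    'bronze': [9, 10, 5],
})

def spread(values):
    # Hypothetical custom function: largest value minus smallest value.
    return values.max() - values.min()

# Default (axis=0): spread receives each column as a Series.
per_column = df.apply(spread)

# axis=1: spread receives each row as a Series.
per_row = df.apply(spread, axis=1)

# On a Series, apply calls the function once per element.
per_element = df['gold'].apply(lambda x: x * 2)

print(per_column)
print(per_row)
print(per_element)
```

Here per_column has one entry per medal type, per_row one entry per country, and per_element one entry per value in the gold column.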
Please read the docs linked in the above paragraph for further information.