Increase limit of value when infinity is reached in Pandas

Question

Data Structure:

HEIGHT Category
   51        1
   45        1
   89        2

Objective: Calculate the geometric mean of HEIGHT for each Category.

import pandas as pd
import numpy as np
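df = pd.read_csv('BaseFish', delimiter=',')
# dropna returns a new DataFrame, so the result has to be assigned back
df = df.dropna(axis=0)
# drop zero heights, which would force the product to zero
df = df[df.HEIGHT != 0]
# product and count of HEIGHT per Category
table = pd.pivot_table(df, values='HEIGHT', index='Category', aggfunc=(np.prod, np.count_nonzero))
table.insert(2, 'GMEAN', 0)
table['GMEAN'] = table['prod'] ** (1 / table['count_nonzero'])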

Problem: Categories with a large number of data points produce np.prod = infinity, so the final GMEAN is also infinity.
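
To illustrate the overflow, here is a minimal sketch using made-up data (400 hypothetical heights of 50, not values from the real file). The product exceeds the largest float64, roughly 1.8e308, so it becomes infinity, while averaging the logarithms stays in range:

```python
import numpy as np

# Hypothetical data: 400 heights of 50 each. The true product is 50**400,
# far larger than the biggest float64 (about 1.8e308).
heights = np.full(400, 50.0)

# Silence the overflow warning; the result is still infinity.
with np.errstate(over='ignore'):
    product = np.prod(heights)

print(np.isinf(product))               # the product overflowed
print(product ** (1 / len(heights)))   # the root of infinity is still infinity

# Averaging logarithms keeps every intermediate value small.
log_gmean = np.exp(np.log(heights).mean())
print(log_gmean)
```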

My Python knowledge is very basic, and the only reason I am using it is that the number of data points exceeds Excel's limit.

Answer 1

There is no need to use a pivot table here. You can group by category and then compute the geometric mean per category.

from scipy.stats import gmean
df.groupby('Category').HEIGHT.apply(gmean)

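Or without importing scipy.stats (note that this version still multiplies all the values, so it can overflow to infinity for large groups just as np.prod does):

gmean = lambda group: group.prod()**(1/len(group))
df.groupby('Category').HEIGHT.apply(gmean)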
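
If you want to avoid both scipy and the overflow, one option is to take the mean of the logarithms and exponentiate the result, which is mathematically the same geometric mean (scipy.stats.gmean also works in log space). The sketch below is only an illustration: the helper name log_gmean and the sample data are made up, and it assumes every HEIGHT is strictly positive.

```python
import numpy as np
import pandas as pd

def log_gmean(values):
    """Geometric mean as exp(mean(log(values))); assumes all values > 0."""
    return np.exp(np.log(values).mean())

# Hypothetical sample data using the question's column names.
# Category 3 has 400 rows, enough to overflow a plain product.
df = pd.DataFrame({
    'HEIGHT': [51, 45, 89] + [50] * 400,
    'Category': [1, 1, 2] + [3] * 400,
})

# Logarithms need strictly positive inputs.
df = df[df.HEIGHT > 0]

result = df.groupby('Category').HEIGHT.apply(log_gmean)
print(result)
```

Because the logarithms are summed rather than the raw values multiplied, the intermediate numbers stay small regardless of group size.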

solution1
3 ACCEPTED 2018-10-10 07:20:06
