I have a DataFrame with data like the following:
Time,Col2,Col3,Col4,Col5,Col6,Col7,Col8,Col9,Col10,Col11,Col12,Col13
05:17:55.703,,,,,,21,,3, 89,891,11,
05:17:55.703,,,,,,21,,3, 217,891,12,
05:17:55.703,,,,,,21,,3, 217,891,13,
05:17:55.703,,,,,,21,,3, 217,891,15,
05:17:55.703,,,,,,21,,3, 217,891,16,
05:17:55.703,,,,,,21,,3, 217,891,17,
05:17:55.703,,,,,,21,,3, 217,891,18,
05:17:55.707,,,,,,18,,3, 185,892,0,
05:17:55.707,,,,,,21,,3, 185,892,1,
05:17:55.707,,,,,,17,,3, 73,892,5,
05:17:55.707,,,,,,17,,3, 185,892,6,
05:17:55.707,,,,,,21,,3, 73,892,7,
05:17:55.708,268,4,28,-67.60,13,,2,,,,,2
05:17:55.711,,,,,,18,,3, 57,892,10,
05:17:55.711,,,,,,21,,3, 201,892,11,
05:17:55.711,,,,,,21,,3, 25,892,12,
05:17:55.723,,,,,,21,,3, 217,893,11,
05:17:55.723,,,,,,21,,3, 217,893,15,
05:17:55.723,,,,,,21,,3, 217,893,16,
05:17:55.726,268,4,,-67.80,,,,,,,,
05:17:55.728,,,28,,12,31,2,3, 185,894,0,1
I need to aggregate each column with a different aggregation function. That is done as follows:
df['Time'] = pd.to_timedelta(df['Time'])
d = {'Col2':'mean', 'Col3':'max', 'Col5':'median'}
df2 = df.groupby(pd.Grouper(freq='40L', key='Time')).agg(d)
Now, for another column, say Col1, I need to pass a custom mode function like the one below:
def mode1(x):
    m = pd.Series.mode(x)
    return m.values[0] if not m.empty else np.nan
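As a quick check of what this helper returns, here is a minimal sketch that calls it on a few hand-made Series. The sample values are made up for illustration and are not taken from the data above.

```python
import numpy as np
import pandas as pd


def mode1(x):
    # Most frequent value; NaN when the Series has no non-null values.
    m = pd.Series.mode(x)
    return m.values[0] if not m.empty else np.nan


clear_winner = mode1(pd.Series([21, 21, 18, 17]))  # 21 occurs most often
tie = mode1(pd.Series([3, 1, 3, 1]))               # ties: mode() sorts, so the smallest wins
all_missing = mode1(pd.Series([np.nan, np.nan]))   # mode() drops NaN, leaving nothing

print(clear_winner, tie, all_missing)
```

Note that on a tie the helper silently picks the smallest of the tied values, which may or may not be what you want for your data.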
I can add mode1 to the dictionary as shown below, and the aggregation works.
aggDict = {'Col1': mode1, 'Col2':'mean', 'Col3':'max', 'Col5':'median'}
d = {'Col2':'mean', 'Col3':'max', 'Col5':'median'}
df2 = df.groupby(pd.Grouper(freq='40L', key='Time')).agg(aggDict)
Further, I need to read this dictionary from a config file so that I can reuse it with different DataFrames that have different column names and aggregation methods.
So I create a config file, say config.ini, like the one below and read it with ConfigParser.
config.ini:
[Config1]
# for PDSCH and CSF info Apex custom grid
Col1 = mode1
Col2 = mean
Col3 = max
Col4 = median
Reading the config file:
from configparser import ConfigParser
cfgparser = ConfigParser()
cfgparser.optionxform = str # to keep case sensitive keys
cfgparser.read('config.ini')
aggDict = dict(cfgparser.items('Config1'))
When I pass aggDict to the .agg() function, as in df2 = df.groupby(pd.Grouper(freq='40L', key='Time')).agg(aggDict), it complains: 'SeriesGroupBy' object has no attribute 'mode1'.
I know what the problem is: aggDict looks like this (and rightly so):
{'Col1': 'mode1',
'Col2': 'mean',
'Col3': 'max',
'Col4': 'median'}
When mode1 is passed as a string, SeriesGroupBy cannot find it. How can I make SeriesGroupBy find the user-defined mode1 function when its name comes from ConfigParser?
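To make the mismatch concrete, here is a minimal sketch that needs only the standard library. It parses a shortened version of the config from an inline string (an assumption made so the snippet runs without a file on disk) and shows that every value comes back as a plain string, never as a function object.

```python
from configparser import ConfigParser

cfgparser = ConfigParser()
cfgparser.optionxform = str  # keep keys case sensitive, as in the question
cfgparser.read_string("""
[Config1]
Col1 = mode1
Col2 = mean
""")

aggDict = dict(cfgparser.items("Config1"))

# ConfigParser has no notion of Python functions: each value is a str.
value_types = {col: type(val).__name__ for col, val in aggDict.items()}
print(aggDict)
print(value_types)
```

A string such as 'mean' works because pandas can resolve it to a method of the groupby object; 'mode1' names no such method, which is consistent with the error message above.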
I think you need to look the function up by name in globals() or locals(), depending on the scope. That would mean:
aggDict = {'Col1': globals()['mode1'], 'Col2':'mean', 'Col3':'max', 'Col5':'median'}
This fetches the custom function object by name from globals() and passes that object, rather than the string 'mode1', to .agg(). It assumes the function is defined in the same module (file) as this code. When you build aggDict from the parsed config, replace the custom function's name with the looked-up function in the same way as in the code above.
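One way to apply this to the parsed config is sketched below. Instead of globals(), it uses an explicit registry dictionary, which is a substitution on my part: it does the same name-to-function lookup but only for names you list on purpose. The names CUSTOM_AGGS and resolve_aggs are hypothetical, the config is read from an inline string, and the small DataFrame is made-up sample data. The frequency is written as '40ms', the same 40 millisecond bin as '40L', because newer pandas versions deprecate the 'L' alias.

```python
from configparser import ConfigParser

import numpy as np
import pandas as pd


def mode1(x):
    # Most frequent value in the group, or NaN if the group has no values.
    m = pd.Series.mode(x)
    return m.values[0] if not m.empty else np.nan


# Hypothetical registry: the only custom names a config file may refer to.
CUSTOM_AGGS = {"mode1": mode1}


def resolve_aggs(raw):
    # Swap registered names for callables; leave other names (e.g. 'mean')
    # as strings so pandas resolves them itself.
    return {col: CUSTOM_AGGS.get(name, name) for col, name in raw.items()}


cfgparser = ConfigParser()
cfgparser.optionxform = str  # keep keys case sensitive
cfgparser.read_string("[Config1]\nCol1 = mode1\nCol2 = mean\n")

aggDict = resolve_aggs(dict(cfgparser.items("Config1")))

# Made-up sample data: three rows fall in the first 40 ms bin, one in the next.
df = pd.DataFrame({
    "Time": pd.to_timedelta(
        ["00:00:00.001", "00:00:00.002", "00:00:00.003", "00:00:00.050"]
    ),
    "Col1": [7, 7, 9, 4],
    "Col2": [1.0, 2.0, 3.0, 10.0],
})
df2 = df.groupby(pd.Grouper(freq="40ms", key="Time")).agg(aggDict)
print(df2)
```

With globals()['mode1'], a typo or an unexpected name in the config either raises a KeyError or picks up whatever object happens to have that name in the module. With the registry, an unknown name is simply passed through to pandas as a string, so you still get the familiar "no attribute" error for it.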