I can't understand the code:
pivot = pd.pivot_table(subset, values='count', rows=['date'], cols=['sample'], fill_value=0)
by = lambda x: lambda y: getattr(y, x)
grouped = pivot.groupby([by('year'),by('month')]).sum()
`subset` in the code is a DataFrame that has a column named "date" (e.g. 2013-02-04 06:20:49.634244) and does not have columns named "year" or "month".
Where I have trouble:
I can't figure out what "year" and "month" refer to in:
grouped = pivot.groupby([by('year'),by('month')]).sum()
What does this line mean?
What I have done:
The pandas documentation says that the first parameter of pandas.DataFrame.groupby can be:
by : mapping function / list of functions, dict, Series, or tuple /
by = lambda x: lambda y: getattr(y, x)
means that by('bar') returns a function, and that function returns the attribute 'bar' of whatever object it is given.
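As a small illustration of that helper on its own, here is a sketch that uses only the standard library. The names `get_year`, `get_month` and `stamp` are made up for this example, and the timestamp is the sample date quoted in the question.

```python
from datetime import datetime

# Same helper as in the question: by(name) builds a function that
# reads the attribute `name` from whatever object it is later given.
by = lambda x: lambda y: getattr(y, x)

get_year = by('year')    # equivalent to: lambda y: y.year
get_month = by('month')  # equivalent to: lambda y: y.month

# Hypothetical sample value, taken from the date quoted in the question.
stamp = datetime(2013, 2, 4, 6, 20, 49, 634244)

print(get_year(stamp))   # 2013
print(get_month(stamp))  # 2
```

So "year" and "month" are not column names; they are attribute names looked up on each date-like object.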
If a callable is passed to groupby, it is applied to the values of the DataFrame's index, so this code is grouping by the year and month of a datetime-like index.
In [55]: df = pd.DataFrame({'a': 1.0},
   ....:                   index=pd.date_range('2014-01-01', periods=13, freq='M'))
In [56]: df.groupby([by('year'), by('month')]).sum()
Out[56]:
           a
2014 1   1.0
     2   1.0
     3   1.0
     4   1.0
     5   1.0
     6   1.0
     7   1.0
     8   1.0
     9   1.0
     10  1.0
     11  1.0
     12  1.0
2015 1   1.0
More explicitly, the same grouping can be written with the index attributes directly:
In [57]: df.groupby([df.index.year, df.index.month]).sum()
Out[57]:
           a
2014 1   1.0
     2   1.0
     3   1.0
     4   1.0
     5   1.0
     6   1.0
     7   1.0
     8   1.0
     9   1.0
     10  1.0
     11  1.0
     12  1.0
2015 1   1.0