Pandas Groupby/Grouper group by starting index value

Question

How do I adjust the starting time in Grouper?

Starting with this sample DF:

import datetime as DT
import pandas as pd

df = pd.DataFrame({
    'Buyer': 'Carl Mark Carl Joe Joe Carl'.split(),
    'Quantity': [1,3,5,8,9,3],
    'Date' : [
        DT.datetime(2013,1,1,13,0),
        DT.datetime(2013,3,1,13,5),
        DT.datetime(2013,5,1,20,0),
        DT.datetime(2013,8,2,10,0),
        DT.datetime(2013,9,2,12,0),
        DT.datetime(2013,11,2,14,0),
    ]})
df = df.set_index('Date')

df.groupby(pd.Grouper(freq='1MS'))["Quantity"].count()

Date
2013-01-01    1
2013-02-01    0
2013-03-01    1
2013-04-01    0
2013-05-01    1
2013-06-01    0
2013-07-01    0
2013-08-01    1
2013-09-01    1
2013-10-01    0
2013-11-01    1

df.groupby(pd.Grouper(freq='2MS'))["Quantity"].count()

Date
2013-01-01    1
2013-03-01    1
2013-05-01    1
2013-07-01    1
2013-09-01    1
2013-11-01    1

What I am looking for is "2MS" bins that start from each index date, using Grouper or TimeGrouper. The above returns "2MS" bins anchored at the first value in the index, 1/1/2013. How do I get a 2MS bin starting from 8/1/2013, which would give a count of 2?

Targeting:

Date
2013-01-01    1
2013-03-01    1
2013-05-01    1
2013-08-01    2
2013-09-01    1
2013-11-01    1

Notes:

I am trying to group based on the index values: the first group would start its slice at 1/1, the second at 3/1, the third at 5/1, and each would span a 2MS period. Right now Grouper starts slicing from the first date and continues in fixed two-month intervals. The fourth interval should start on 8/1 and end on 10/2. Right now, the 8/2 row falls in the interval starting on 7/1.

Answer 1

You want a forward-looking rolling window, while pandas rolling windows look backwards by default. So the idea is to reverse the order of your series, take a rolling window, and then restore the original order.
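To see the reversal trick in isolation, here is a minimal sketch on a made-up four-month series (the values and the names s, backward and forward are illustrative only, not taken from the question):

```python
import pandas as pd

# Toy data, made up for illustration: one count per month start.
s = pd.Series([1, 0, 1, 1],
              index=pd.date_range("2013-01-01", periods=4, freq="MS"))

# A plain rolling window looks backwards: current row plus the previous row.
backward = s.rolling(2, min_periods=1).sum()

# Reverse, roll, reverse again: current row plus the next row.
forward = s[::-1].rolling(2, min_periods=1).sum()[::-1]

print(backward.tolist())
print(forward.tolist())
```

The only difference between the two results is which neighbour each row is paired with.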

This is what you already had:

from datetime import datetime

import pandas as pd

df = pd.DataFrame({'Buyer': 'Carl Mark Carl Joe Joe Carl'.split(),
                   'Quantity': [1, 3, 5, 8, 9, 3],
                   'Date' : [datetime(2013, 1, 1, 13, 0),
                             datetime(2013, 3, 1, 13, 5),
                             datetime(2013, 5, 1, 20, 0),
                             datetime(2013, 8, 2, 10, 0),
                             datetime(2013, 9, 2, 12, 0),
                             datetime(2013, 11, 2, 14, 0)]})
df = df.set_index('Date')
print(df)

#                     Buyer  Quantity
# Date                               
# 2013-01-01 13:00:00  Carl         1
# 2013-03-01 13:05:00  Mark         3
# 2013-05-01 20:00:00  Carl         5
# 2013-08-02 10:00:00   Joe         8
# 2013-09-02 12:00:00   Joe         9
# 2013-11-02 14:00:00  Carl         3

g1 = df.resample('MS')["Quantity"].count()
print(g1)

# Date
# 2013-01-01    1
# 2013-02-01    0
# 2013-03-01    1
# 2013-04-01    0
# 2013-05-01    1
# 2013-06-01    0
# 2013-07-01    0
# 2013-08-01    1
# 2013-09-01    1
# 2013-10-01    0
# 2013-11-01    1
# Freq: MS, Name: Quantity, dtype: int64

And this is how to get to the finish line:

# Reverse the order, sum each month with the following one, then restore the order.
g2 = g1.sort_index(ascending=False).rolling(2, min_periods=0).sum().sort_index()
print(g2)

# Date
# 2013-01-01    1.0
# 2013-02-01    1.0
# 2013-03-01    1.0
# 2013-04-01    1.0
# 2013-05-01    1.0
# 2013-06-01    0.0
# 2013-07-01    1.0
# 2013-08-01    2.0
# 2013-09-01    1.0
# 2013-10-01    1.0
# 2013-11-01    1.0
# Freq: MS, Name: Quantity, dtype: float64

# Keep only the months that actually contain a row.
print(g2[g1 != 0].astype(int))

# Date
# 2013-01-01    1
# 2013-03-01    1
# 2013-05-01    1
# 2013-08-01    2
# 2013-09-01    1
# 2013-11-01    1
# Name: Quantity, dtype: int64
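As a possible alternative to the double sort, newer pandas versions ship a forward-looking window indexer. The sketch below assumes pandas 1.1 or later, where pandas.api.indexers.FixedForwardWindowIndexer is available; the names monthly, indexer and result are mine, and the data is the sample from the question with the Buyer column left out for brevity.

```python
import pandas as pd
from pandas.api.indexers import FixedForwardWindowIndexer

# Sample data from the question (Buyer column omitted).
dates = pd.to_datetime([
    "2013-01-01 13:00", "2013-03-01 13:05", "2013-05-01 20:00",
    "2013-08-02 10:00", "2013-09-02 12:00", "2013-11-02 14:00",
])
df = pd.DataFrame({"Quantity": [1, 3, 5, 8, 9, 3]},
                  index=pd.DatetimeIndex(dates, name="Date"))

# Count rows per calendar month, including empty months.
monthly = df.resample("MS")["Quantity"].count()

# Each window covers the current month and the one after it.
indexer = FixedForwardWindowIndexer(window_size=2)
forward = monthly.rolling(indexer, min_periods=1).sum()

# Keep only the months that actually contain a row.
result = forward[monthly != 0].astype(int)
print(result)
```

This should give the same six rows as the reversed rolling sum above, without sorting the series twice.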
