Pandas apply timegrouper on column
Let's consider the following dataframe:
import pandas as pd
from pandas import Timestamp

data=[{'close': 1.16155,
'datetime': Timestamp('2017-11-01 22:29:40'),
'high': 1.16155,
'low': 1.16155,
'open': 1.16155,
'symbol': 'European Monetary Union Euro - United States dollar',
'volume': -1.0},
{'close': 1.00325,
'datetime': Timestamp('2017-11-01 22:29:40'),
'high': 1.00325,
'low': 1.00325,
'open': 1.00325,
'symbol': 'United States dollar - Swiss franc',
'volume': -1.0},
{'close': 1.324475,
'datetime': Timestamp('2017-11-01 22:29:40'),
'high': 1.324475,
'low': 1.324475,
'open': 1.324475,
'symbol': 'British pound - United States dollar',
'volume': -1.0},
{'close': 1.324475,
'datetime': Timestamp('2017-11-01 22:29:45'),
'high': 1.324475,
'low': 1.324475,
'open': 1.324475,
'symbol': 'British pound - United States dollar',
'volume': -1.0},
{'close': 1.16155,
'datetime': Timestamp('2017-11-01 22:29:45'),
'high': 1.16155,
'low': 1.16155,
'open': 1.16155,
'symbol': 'European Monetary Union Euro - United States dollar',
'volume': -1.0}]
df=pd.DataFrame(data)
I would like to use groupby to group by symbol and datetime, without setting the index to either symbol or datetime.
Ideally, the call would look something like this:
df.groupby(["symbol",pd.TimeGrouper("30T","datetime")]).count()
I can get the grouping by setting datetime as the index first:
df.set_index("datetime").groupby(["symbol",pd.TimeGrouper("30T")]).count()
But again, I would like to do it without setting the index to datetime or symbol.
Thanks!
Is that what you want?
In [198]: df.groupby(["symbol",pd.TimeGrouper("30T", key="datetime")]).count()
Out[198]:
                                                                        close  high  low  open  volume
symbol                                             datetime
British pound - United States dollar               2017-11-01 22:00:00      2     2    2     2       2
European Monetary Union Euro - United States do... 2017-11-01 22:00:00      2     2    2     2       2
United States dollar - Swiss franc                 2017-11-01 22:00:00      1     1    1     1       1
or using Grouper:
In [203]: df.groupby(["symbol",pd.Grouper(freq="30T", key="datetime")]).count()
Out[203]:
                                                                        close  high  low  open  volume
symbol                                             datetime
British pound - United States dollar               2017-11-01 22:00:00      2     2    2     2       2
European Monetary Union Euro - United States do... 2017-11-01 22:00:00      2     2    2     2       2
United States dollar - Swiss franc                 2017-11-01 22:00:00      1     1    1     1       1
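A note for newer pandas versions: pd.TimeGrouper was deprecated in 0.21 and removed in 1.0, so only the pd.Grouper form still runs there, and recent releases also warn about the "T" alias in favor of "min". Below is a minimal sketch of the same grouping under those assumptions. The shortened symbol names and the single close column are my own simplification of the question's data, not part of the original.

```python
import pandas as pd

# Reduced version of the question's data; symbol names are abbreviated
# here for readability (an assumption, not the original strings).
df = pd.DataFrame({
    "symbol": ["EUR/USD", "USD/CHF", "GBP/USD", "GBP/USD", "EUR/USD"],
    "datetime": pd.to_datetime([
        "2017-11-01 22:29:40", "2017-11-01 22:29:40", "2017-11-01 22:29:40",
        "2017-11-01 22:29:45", "2017-11-01 22:29:45",
    ]),
    "close": [1.16155, 1.00325, 1.324475, 1.324475, 1.16155],
})

# key= points the Grouper at a column, so df keeps its default index.
counts = df.groupby(["symbol", pd.Grouper(key="datetime", freq="30min")]).count()
print(counts)
```

The result is indexed by (symbol, datetime), while df itself is unchanged and keeps its RangeIndex. If flat columns are preferred, counts.reset_index() turns the two index levels back into ordinary columns.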
PS: the docstring for TimeGrouper could be a bit more detailed:
In [204]: pd.TimeGrouper?
Init signature: pd.TimeGrouper(*args, **kwargs)
Docstring:
Custom groupby class for time-interval grouping
Parameters
----------
freq : pandas date offset or offset alias for identifying bin edges
closed : closed end of interval; left or right
label : interval boundary to use for labeling; left or right
nperiods : optional, integer
convention : {'start', 'end', 'e', 's'}
If axis is PeriodIndex
It looks better for pd.Grouper:
In [205]: pd.Grouper?
Init signature: pd.Grouper(*args, **kwargs)
Docstring:
A Grouper allows the user to specify a groupby instruction for a target
object
This specification will select a column via the key parameter, or if the
level and/or axis parameters are given, a level of the index of the target
object.
These are local specifications and will override 'global' settings,
that is the parameters axis and level which are passed to the groupby
itself.
Parameters
----------
key : string, defaults to None
groupby key, which selects the grouping column of the target
level : name/number, defaults to None
the level for the target index
freq : string / frequency object, defaults to None
This will groupby the specified frequency if the target selection
(via key or level) is a datetime-like object. For full specification
of available frequencies, please see `here
<http://pandas.pydata.org/pandas-docs/stable/timeseries.html#offset-aliases>`_.
axis : number/name of the axis, defaults to 0
sort : boolean, default to False
whether to sort the resulting labels
additional kwargs to control time-like groupers (when freq is passed)
closed : closed end of interval; left or right
label : interval boundary to use for labeling; left or right
convention : {'start', 'end', 'e', 's'}
If grouper is PeriodIndex
Returns
-------
A specification for a groupby instruction
Examples
--------
Syntactic sugar for ``df.groupby('A')``
>>> df.groupby(Grouper(key='A'))
Specify a resample operation on the column 'date'
>>> df.groupby(Grouper(key='date', freq='60s'))
Specify a resample operation on the level 'date' on the columns axis
with a frequency of 60s
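To make the closed and label parameters from the docstring above more concrete, here is a small sketch. The three timestamps are invented for illustration, chosen so that two of them fall exactly on a bin edge, and "30min" is used as the frequency alias.

```python
import pandas as pd

# Invented sample: two rows sit exactly on 30-minute boundaries.
df = pd.DataFrame({
    "datetime": pd.to_datetime([
        "2017-11-01 22:00:00", "2017-11-01 22:29:40", "2017-11-01 22:30:00",
    ]),
    "close": [1.0, 2.0, 3.0],
})

# Default for this frequency: bins are closed on the left edge
# and labeled with the left edge.
left = df.groupby(pd.Grouper(key="datetime", freq="30min")).count()

# Same bins, but closed on the right edge and labeled with the right edge,
# so a row exactly on a boundary moves into the earlier bin.
right = df.groupby(
    pd.Grouper(key="datetime", freq="30min", closed="right", label="right")
).count()

print(left)
print(right)
```

Both results carry the labels 22:00 and 22:30, but the rows on the boundaries are assigned differently, which is why the per-bin counts differ between the two.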