
DataFrame split by index condition. Split DataFrame into Bars

I need to split a Python DataFrame with a datetime index into bars of a specific length. That is, one DataFrame should be split into 30-minute bars, 60-minute bars, 240-minute bars, and so on. Currently I'm trying to split it with a loop like this:

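from datetime import timedelta

class Bar:
    def __init__(self, o, h, l, c, v):
        self.Open = o
        self.High = h
        self.Low = l
        self.Close = c
        self.Volume = v

def GenerateBar(smallBars):
    # Use positional access for the first Open and the last Close of the window
    o = smallBars["Open"].iloc[0]
    h = smallBars["High"].max()
    l = smallBars["Low"].min()
    c = smallBars["Close"].iloc[-1]
    vol = smallBars["Volume"].sum()
    return Bar(o, h, l, c, vol)

def GenerateBars(df, duration):
    # Assumed setup (not shown in the original snippet): walk from the
    # first timestamp to the last and collect the bars in a list
    bars = []
    cur_time = df.index[0]
    last_date = df.index[-1]
    while cur_time <= last_date:
        next_time = cur_time + timedelta(minutes=duration)
        window = df[(df.index >= cur_time) & (df.index < next_time)]
        if not window.empty:
            bars.append(GenerateBar(window))
        cur_time = next_time
    return bars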

But this approach takes far too long to split the DataFrame into bars.

If you have a DataFrame with a DatetimeIndex, you can use resample. Also check out this guide.

Assuming df is the DataFrame you want to convert, a resample to 30 minutes (30T; pandas 2.2 and later prefer the alias 30min) would look like this:

df = df.resample('30T').sum()
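To make the binning concrete, here is a minimal sketch on made-up data. The frame minute_df and its single Volume column are assumptions for illustration only, and the alias '30min' is used so the snippet also runs on recent pandas versions.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: 60 one-minute rows starting at 09:00
idx = pd.date_range("2024-01-01 09:00", periods=60, freq="min")
minute_df = pd.DataFrame({"Volume": np.ones(60, dtype=int)}, index=idx)

# Each 30-minute bin is labelled by its left edge and includes that edge
half_hour = minute_df.resample("30min").sum()
print(half_hour)
```

The result has one row per 30-minute window (09:00 and 09:30), so no explicit Python loop over time windows is needed.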

This simple solution becomes a problem if you need the mean of one column (e.g. price) and the sum of another column (e.g. volume). You can handle this with agg on the resampled object. Passing a list of function names computes each of them for every column:

df = df.resample('30T').agg(['mean', 'sum'])
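To apply a different function to each column, agg also accepts a dict that maps column names to functions. The sketch below uses that to build the bars from the question. The helper name make_bars and the sample data are hypothetical, and it assumes the column names Open, High, Low, Close and Volume used in the question's code.

```python
import numpy as np
import pandas as pd

def make_bars(df, minutes):
    """Aggregate small OHLCV bars into bars of `minutes` length (hypothetical helper)."""
    bars = df.resample(f"{minutes}min").agg({
        "Open": "first",   # first Open in the window
        "High": "max",     # highest High
        "Low": "min",      # lowest Low
        "Close": "last",   # last Close in the window
        "Volume": "sum",   # total Volume
    })
    # Drop windows that contained no rows (e.g. gaps outside trading hours)
    return bars.dropna(subset=["Open"])

# Made-up one-minute data for illustration
idx = pd.date_range("2024-01-01 09:00", periods=60, freq="min")
prices = np.arange(60, dtype=float)
minute_df = pd.DataFrame(
    {"Open": prices, "High": prices + 1, "Low": prices - 1,
     "Close": prices + 0.5, "Volume": 10},
    index=idx,
)

bars_30 = make_bars(minute_df, 30)
print(bars_30)
```

Calling make_bars(minute_df, 60) or make_bars(minute_df, 240) would give the other bar lengths in the same way. The result is a DataFrame with one row per bar rather than a list of Bar objects.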
