Renko Chart Wiki: https://en.wikipedia.org/wiki/Renko_chart
I'm trying to generate a Renko chart from trade tick data. The tick data contains Timestamp, Price, and Volume. The Timestamp is in unix milliseconds, e.g. 1649289600174.
Pandas already supports time-based OHLC resampling via df.resample('10Min').agg({'Price': 'ohlc', 'Volume': 'sum'}). However, I would like to resample the trade data by price, not by Timestamp.
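For reference, the built-in time-based resample mentioned above can be sketched like this, assuming the Timestamp column is first converted from unix milliseconds to a DatetimeIndex (only the first three sample rows are used here):

```python
import pandas as pd

# Minimal sketch of the time-based OHLC resample that pandas supports
# out of the box, using the column names from the sample data.
df = pd.DataFrame({
    "Timestamp": [1649289600174, 1649289600176, 1649289600178],
    "Price": [100, 105, 110],
    "Volume": [100, 100, 100],
})

# Unix milliseconds -> DatetimeIndex, then resample by wall-clock time
df.index = pd.to_datetime(df["Timestamp"], unit="ms")
ohlc = df.resample("10min").agg({"Price": "ohlc", "Volume": "sum"})
print(ohlc)
```

All three ticks fall inside one 10-minute window, so this produces a single OHLC row; the question is asking for the analogous operation keyed on price movement instead of time.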
Renko charts use a fixed brick size. For example, a brick_size of 10 generates a new brick whenever the price moves up or down 10 points.
I have been told by a pandas contributor that this can be done via groupby with a binned grouper, but I don't quite understand what that means.
This is what my original data looks like:
Timestamp Price Volume
1649289600174 100 100
1649289600176 105 100
1649289600178 110 100
1649289600179 104 100
1649289600181 101 100
1649289600182 100 100
1649289600183 103 100
1649289600184 107 100
1649289600185 102 100
1649289600186 99 100
1649289600188 93 100
1649289600189 90 100
1649289600192 95 100
1649289600193 100 100
1649289600194 105 100
1649289600195 110 100
1649289600196 115 100
1649289600197 120 100
I'm looking for an option that looks like df.resample('10Numeric').agg({'Price': 'ohlc', 'Volume': 'sum'}), where 10Numeric means the brick_size is 10: whenever the price moves up or down 10 points, I would like to aggregate the data within that period.
The output should look like
Timestamp Open High Low Close Volume
1649289600178 100 110 100 110 300
1649289600182 110 110 100 100 300
1649289600189 100 107 90 90 600
1649289600193 90 100 90 100 200
1649289600195 100 110 100 110 200
1649289600197 110 120 110 120 200
I believe the pandas contributor was talking about pd.cut, followed by a groupby. Something like this:
import pandas as pd
import numpy as np
df = pd.DataFrame({'price': np.random.randint(1, 100, 100)})
df['bins'] = pd.cut(x=df['price'], bins=[0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100])
That output looks like this.
price bins
0 92 (90, 100]
1 15 (10, 20]
2 54 (50, 60]
3 55 (50, 60]
4 72 (70, 80]
.. ... ...
95 88 (80, 90]
96 21 (20, 30]
97 91 (90, 100]
98 51 (50, 60]
99 18 (10, 20]
Please note: price data is not unique. The price of bitcoin one year ago may have been 45555 USD, and it can return to the same price this year. With a bin size of 100, both would fall in (45500, 45600].
A plain groupby would put the year-old data and the current data into the same bin. I'm looking for a solution that follows the price movement, e.g. the closing prices should look like 45500, 45600, 45700, 45600, 45500, 45400, 45300, 45200, 45100, 45000.
Can someone explain what the pandas contributor means by groupby with a binned grouper?
Is this what you're looking for?
df['bins'] = pd.cut(x=df['Price'], bins=range(df['Price'].min() - 10, df['Price'].max() + 10, 10))
df.groupby('bins').agg({'Price': 'ohlc', 'Volume': 'sum'})
Output:
Price Volume
open high low close Volume
bins
(80, 90] 90 90 90 90 100
(90, 100] 100 100 93 100 600
(100, 110] 105 110 101 110 900
(110, 120] 115 120 115 120 200
You could create a new column based on pd.cut, do a cumsum over the bin changes, and group by that.
import pandas as pd
import numpy as np
df = pd.DataFrame(
[
{"Timestamp": 1649289600174, "Price": 100, "Volume": 100},
{"Timestamp": 1649289600176, "Price": 105, "Volume": 100},
{"Timestamp": 1649289600178, "Price": 110, "Volume": 100},
{"Timestamp": 1649289600179, "Price": 104, "Volume": 100},
{"Timestamp": 1649289600181, "Price": 101, "Volume": 100},
{"Timestamp": 1649289600182, "Price": 100, "Volume": 100},
{"Timestamp": 1649289600183, "Price": 103, "Volume": 100},
{"Timestamp": 1649289600184, "Price": 107, "Volume": 100},
{"Timestamp": 1649289600185, "Price": 102, "Volume": 100},
{"Timestamp": 1649289600186, "Price": 99, "Volume": 100},
{"Timestamp": 1649289600188, "Price": 93, "Volume": 100},
{"Timestamp": 1649289600189, "Price": 90, "Volume": 100},
{"Timestamp": 1649289600192, "Price": 95, "Volume": 100},
{"Timestamp": 1649289600193, "Price": 100, "Volume": 100},
{"Timestamp": 1649289600194, "Price": 105, "Volume": 100},
{"Timestamp": 1649289600195, "Price": 110, "Volume": 100},
{"Timestamp": 1649289600196, "Price": 115, "Volume": 100},
{"Timestamp": 1649289600197, "Price": 120, "Volume": 100},
]
)
# Bin each price into fixed 10-point buckets; .cat.codes gives the bin index
codes = pd.cut(df["Price"], bins=np.arange(0, 200, 10), right=False).cat.codes
# Start a new group whenever the bin index changes, so revisited price
# levels end up in separate groups instead of being merged
df.groupby((codes != codes.shift(1)).cumsum()).agg(
    {"Price": "ohlc", "Volume": "sum", "Timestamp": "min"}
)
This'll give you:
Price Volume Timestamp
open high low close Volume Timestamp
1 100 105 100 105 200 1649289600174
2 110 110 110 110 100 1649289600178
3 104 107 100 102 600 1649289600179
4 99 99 90 95 400 1649289600186
5 100 105 100 105 200 1649289600193
6 110 115 110 115 200 1649289600195
7 120 120 120 120 100 1649289600197
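For completeness: the cumsum grouping above starts a new group whenever the price crosses a fixed absolute bin edge, which is close to, but not exactly, the brick table shown in the question, because a renko brick depends on the previous brick's close rather than on fixed levels. A minimal explicit loop over the ticks, a sketch assuming each brick opens at the previous brick's close and uses a brick_size of 10, would be:

```python
import pandas as pd

# The sample tick data from the question
ticks = pd.DataFrame({
    "Timestamp": [1649289600174, 1649289600176, 1649289600178, 1649289600179,
                  1649289600181, 1649289600182, 1649289600183, 1649289600184,
                  1649289600185, 1649289600186, 1649289600188, 1649289600189,
                  1649289600192, 1649289600193, 1649289600194, 1649289600195,
                  1649289600196, 1649289600197],
    "Price": [100, 105, 110, 104, 101, 100, 103, 107, 102, 99,
              93, 90, 95, 100, 105, 110, 115, 120],
    "Volume": [100] * 18,
})

brick_size = 10
bricks = []
open_price = ticks["Price"].iloc[0]   # first brick opens at the first tick
high = low = open_price
volume = 0

for ts, price, vol in ticks.itertuples(index=False):
    # High/Low track intra-brick tick extremes (including the open)
    high = max(high, price)
    low = min(low, price)
    volume += vol
    if abs(price - open_price) >= brick_size:
        # Close the brick at this tick; the next brick opens at its price
        bricks.append({"Timestamp": ts, "Open": open_price, "High": high,
                       "Low": low, "Close": price, "Volume": volume})
        open_price = price
        high = low = price
        volume = 0

renko = pd.DataFrame(bricks)
print(renko)
```

On this sample this reproduces the desired table from the question. Real tick data can gap past several brick levels in a single trade, which this sketch records as one oversized brick rather than splitting it into multiple bricks.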