How do I incrementally add rows to a pandas DataFrame?
I am calculating the open-high-low-close (OHLC) values of my data for each 15-minute interval from 9:15 to 15:30, and I want to store each interval's OHLC values as a new row in a DataFrame.
ohlc = pd.DataFrame(columns=('Open','High','Low','Close'))
for row in ohlc:
    ohlc.loc[10] = pd.DataFrame([[candle_open_price,candle_high_price,candle_low_price,candle_close_price]])
But this does not work; I get the following error:
ValueError: cannot set a row with mismatched columns
I just want to incrementally store the OHLC data that I have calculated for each 15-minute interval as rows of the new ohlc DataFrame.
EDIT
import numpy as np
import pandas as pd
import datetime as dt
import matplotlib as plt
import dateutil.parser
tradedata = pd.read_csv('ICICIBANK_TradeData.csv', index_col=False,
                        names=['Datetime','Price'],
                        header=0)
tradedata['Datetime'] = pd.to_datetime(tradedata['Datetime'])
first_trd_time = tradedata['Datetime'][0]
last_time = dateutil.parser.parse('2016-01-01 15:30:00.000000')
candle_time = 15;
candle_number = 0
while(first_trd_time < last_time):
    candledata = tradedata[(tradedata['Datetime']>first_trd_time) & (tradedata['Datetime']<first_trd_time+dt.timedelta(minutes=candle_time))]
    first_trd_time = first_trd_time+dt.timedelta(minutes=candle_time)
    candle_open_price = candledata.iloc[0]['Price']
    candle_open_time = candledata.iloc[0]['Datetime']
    candle_close_price = candledata.iloc[-1]['Price']
    candle_close_time = candledata.iloc[-1]['Datetime']
    candle_high_price = candledata.loc[candledata['Price'].idxmax()]['Price']
    candle_high_time = candledata.loc[candledata['Price'].idxmax()]['Datetime']
    candle_low_price = candledata.loc[candledata['Price'].idxmin()]['Price']
    candle_low_time = candledata.loc[candledata['Price'].idxmin()]['Datetime']
    ohlc = pd.DataFrame(columns=('Open','High','Low','Close'))
    ohlc_data = pd.DataFrame()
    if(candle_number == 0):
        ohlc = pd.DataFrame(np.array([[0, 0, 0, 0]]), columns=['Open', 'High', 'Low', 'Close']).append(ohlc, ignore_index=True)
        candle_number = candle_number + 1
        print "Zeroth Candle"
    else:
        ohlc.ix[candle_number] = (candle_open_price,candle_open_price,candle_open_price,candle_open_price)
        print "else part with incermenting candle_number"
        candle_number = candle_number + 1
    print "first_trd_time"
    print first_trd_time
    print candle_number
print "Success!"
This is my code. The error is:
ValueError: cannot set by positional indexing with enlargement
IIUC you can append a DataFrame for each row to a list of DataFrames dfs and then concat them to df1:
ohlc = pd.DataFrame(columns=('Open','High','Low','Close'))
dfs = []
# stand-in for your own loop: it must run at least once,
# otherwise dfs stays empty and pd.concat raises ValueError
for row in ohlc.iterrows():
    df = pd.DataFrame([candle_open_price,candle_high_price,
                       candle_low_price,candle_close_price]).T
    dfs.append(df)
df1 = pd.concat(dfs, ignore_index=True)
print (df1)
Then concat to the original DataFrame ohlc:
df2 = pd.concat([ohlc,df1])
print (df2)
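As an aside, if the per-candle values are plain numbers (as in the question) rather than Series, a lighter variant of the same idea is to collect one dict per candle in a list and build the DataFrame once after the loop. This is only a minimal sketch: the `candles` list and its values are made up for illustration and stand in for whatever your loop computes.

```python
import pandas as pd

# Hypothetical per-candle (open, high, low, close) values;
# in the question these are computed inside the while loop.
candles = [
    (100.0, 101.5, 99.5, 101.0),
    (101.0, 102.0, 100.5, 100.8),
    (100.8, 101.2, 100.1, 100.9),
]

rows = []
for candle_open, candle_high, candle_low, candle_close in candles:
    # One dict per candle; the keys become the column names.
    rows.append({'Open': candle_open, 'High': candle_high,
                 'Low': candle_low, 'Close': candle_close})

# Build the DataFrame once, after the loop has finished.
ohlc = pd.DataFrame(rows, columns=['Open', 'High', 'Low', 'Close'])
print(ohlc)
```

If a row really has to be added inside the loop, assigning a plain list of four values, e.g. `ohlc.loc[len(ohlc)] = [o, h, l, c]`, should avoid the mismatched-columns error from the question, but growing a DataFrame row by row is generally slower than building it once.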
Sample for the concat approach (for testing, the same data is added in each iteration of the loop):
#sample data
candle_open_price = pd.Series([1.5,10],
                              name='Open',
                              index=pd.DatetimeIndex(['2016-01-02','2016-01-03']))
candle_high_price = pd.Series([8,9],
                              name='High',
                              index=pd.DatetimeIndex(['2016-01-02','2016-01-03']))
candle_low_price = pd.Series([0,12],
                             name='Low',
                             index=pd.DatetimeIndex(['2016-01-02','2016-01-03']))
candle_close_price = pd.Series([4,5],
                               name='Close',
                               index=pd.DatetimeIndex(['2016-01-02','2016-01-03']))
data = np.array([[1,2,3,5],[7,7,8,9],[10,8,9,3]])
idx = pd.DatetimeIndex(['2016-01-08','2016-01-09','2016-01-10'])
ohlc = pd.DataFrame(data=data,
                    columns=('Open','High','Low','Close'),
                    index=idx)
print (ohlc)
            Open  High  Low  Close
2016-01-08     1     2    3      5
2016-01-09     7     7    8      9
2016-01-10    10     8    9      3
dfs = []
for row in ohlc.iterrows():
    df = pd.DataFrame([candle_open_price,candle_high_price,
                       candle_low_price,candle_close_price]).T
    #print (df)
    dfs.append(df)
df1 = pd.concat(dfs)
print (df1)
            Open  High   Low  Close
2016-01-02   1.5   8.0   0.0    4.0
2016-01-03  10.0   9.0  12.0    5.0
2016-01-02   1.5   8.0   0.0    4.0
2016-01-03  10.0   9.0  12.0    5.0
2016-01-02   1.5   8.0   0.0    4.0
2016-01-03  10.0   9.0  12.0    5.0
df2 = pd.concat([ohlc,df1])
print (df2)
            Open  High   Low  Close
2016-01-08   1.0   2.0   3.0    5.0
2016-01-09   7.0   7.0   8.0    9.0
2016-01-10  10.0   8.0   9.0    3.0
2016-01-02   1.5   8.0   0.0    4.0
2016-01-03  10.0   9.0  12.0    5.0
2016-01-02   1.5   8.0   0.0    4.0
2016-01-03  10.0   9.0  12.0    5.0
2016-01-02   1.5   8.0   0.0    4.0
2016-01-03  10.0   9.0  12.0    5.0
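Since the underlying goal is 15-minute OHLC candles, it may also be worth trying pandas' built-in resampling, which could replace the manual loop altogether. This is a different technique from the concat approach above, not a drop-in fix for the original code. The sketch below uses made-up one-minute prices in place of ICICIBANK_TradeData.csv; only the column names Datetime and Price are taken from the question.

```python
import numpy as np
import pandas as pd

# Made-up trade data standing in for ICICIBANK_TradeData.csv:
# one price per minute from 09:15 to 15:29 on a single day.
times = pd.date_range('2016-01-01 09:15:00', '2016-01-01 15:29:00', freq='1min')
prices = 250 + np.arange(len(times)) % 7
tradedata = pd.DataFrame({'Datetime': times, 'Price': prices})

# Resampling needs a datetime index.
price = tradedata.set_index('Datetime')['Price']

# One row per 15-minute bin with its first, max, min and last price.
ohlc = price.resample('15min').ohlc()

# ohlc() returns lowercase column names; rename to match the question.
ohlc.columns = ['Open', 'High', 'Low', 'Close']
print(ohlc.head())
```

Note that the bins here are closed on the left (09:15:00 belongs to the 09:15 candle), whereas the loop in the question uses strict inequalities on both window edges, so trades that fall exactly on a boundary are treated differently.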