
Issues with Pandas Timeseries Resample

I'm using Python 3.5.1 and pandas 0.18.0 and trying to use this notebook to work with financial tick data, as the exercises are of interest to me:

I'm having issues with some of the commands and wondered whether this is due to the versions of Python and pandas.

For example:

This is the file I am reading in, with the associated output:

data = pd.read_csv('test30dayes2tickforpython.csv', index_col=0, header=0, parse_dates={"Timestamp": [0, 1]})
data.dtypes
Out[80]:
 Open              float64
 High              float64
 Low               float64
 Last              float64
 Volume              int64
 NumberOfTrades      int64
 BidVolume           int64
 AskVolume           int64
dtype: object

When I then try to create another object like this:

ticks = data.ix[:, ['High','Volume']]
ticks

I get NaN values:

                            High  Volume
Timestamp
2015-12-27 23:00:25.000      NaN     NaN
2015-12-27 23:01:11.000      NaN     NaN

But if I use the column positions instead of the names, it works:

ticks = data.ix[:, [1,4]]
ticks

                            High  Volume
Timestamp
2015-12-27 23:00:25.000  2045.25       1
2015-12-27 23:01:11.000  2045.50       2

Why is this?

Also, the notebook shows another object being created:

bars = ticks.Price.resample('1min', how='ohlc')
bars

When I try this, I get this error:

bars = ticks.High.resample('60min', how='ohlc')
bars

1 bars = ticks.High.resample('60min', how='ohlc')
AttributeError: 'DataFrame' object has no attribute 'High'

It works if I don't select the High column:

bars = ticks.resample('60min', how='ohlc')
bars

FutureWarning: how in .resample() is deprecated the new syntax is .resample(...).ohlc()

                        High                              Volume
                        open     high      low    close     open     high      low    close
Timestamp
2015-12-27 23:00:00  2045.25  2047.75  2045.25  2045.25      1.0      7.0      1.0      5.0

What is the correct command for this, please?

I appreciate the notebook is probably not valid for the version of Python/pandas I'm using, but as a newbie I find it very useful, so I would like to get it working on my data.

The problem is the leading spaces in the column names.

print (data.columns)
Index(['Timestamp', ' Open', ' High', ' Low', ' Last', ' Volume',
       ' NumberOfTrades', ' BidVolume', ' AskVolume'],
      dtype='object')
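
To see why selecting by name came back as NaN while selecting by position worked, here is a minimal sketch. The two-row frame is made up for illustration, and `reindex` is used as a stand-in for the old `.ix` lookup, on the assumption that pandas 0.18.0 reindexed when the requested labels were missing.

```python
import pandas as pd

# Hypothetical two-row frame whose column labels carry a leading space,
# mimicking the header of the CSV in the question.
data = pd.DataFrame(
    {" High": [2045.25, 2045.50], " Volume": [1, 2]},
    index=pd.to_datetime(["2015-12-27 23:00:25", "2015-12-27 23:01:11"]),
)

# Label lookup is exact: 'High' is not the same label as ' High'.
print("High" in data.columns)
print(" High" in data.columns)

# reindex() fills labels it cannot find with NaN, which is one way an
# all-NaN result like the one in the question can arise.
missing = data.reindex(columns=["High", "Volume"])
print(missing)

# Positional selection never looks at the labels, so it still works.
by_position = data.iloc[:, [0, 1]]
print(by_position)
```

Running `print(list(data.columns))` or `repr()` on a single label is usually the quickest way to spot this kind of stray whitespace.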

You can strip these spaces:

data.columns = data.columns.str.strip()
print (data.columns)
Index(['Timestamp', 'Open', 'High', 'Low', 'Last', 'Volume', 'NumberOfTrades',
       'BidVolume', 'AskVolume'],
      dtype='object')

ticks = data.ix[:, ['High','Volume']]
print (ticks.head())
      High  Volume
0  2045.25       1
1  2045.50       2
2  2045.50       2
3  2045.50       2
4  2045.50       2
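
As an alternative to stripping after the fact, `read_csv` has a `skipinitialspace` option that drops the space following each delimiter while parsing. The sketch below compares both approaches on a small in-memory CSV; the CSV text and its `Date`/`Time` column names are assumptions for illustration, not the contents of the file in the question.

```python
import io
import pandas as pd

# Hypothetical CSV text with a space after each comma.
csv_text = (
    "Date, Time, High, Volume\n"
    "2015-12-27, 23:00:25, 2045.25, 1\n"
    "2015-12-27, 23:01:11, 2045.50, 2\n"
)

# Default parsing keeps the leading space in the header labels.
raw = pd.read_csv(io.StringIO(csv_text))
print(list(raw.columns))

# Option 1: strip the labels after reading.
stripped = raw.copy()
stripped.columns = stripped.columns.str.strip()
print(list(stripped.columns))

# Option 2: let the parser skip the space that follows each delimiter.
clean = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)
print(list(clean.columns))
```

Note that `str.strip()` also removes trailing whitespace, whereas `skipinitialspace` only handles spaces directly after the delimiter.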

Now you can use:

print (ticks.High.resample('1min', how='ohlc'))

If you don't want to remove the spaces, include the leading space in the column name:

print (ticks[' High'].resample('1min', how='ohlc'))

But it is better to use Resampler.ohlc if your pandas version is 0.18.0 or higher:

print (ticks.High.resample('1min').ohlc())
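
For a self-contained check of the method-chaining syntax, here is a small sketch on synthetic ticks. The five rows are invented for illustration, and the frame is given a `DatetimeIndex` up front because `resample` needs one. Summing the volume per bar is one common choice, not something the notebook prescribes.

```python
import pandas as pd

# Hypothetical tick data: five ticks spread over two minutes.
index = pd.to_datetime([
    "2015-12-27 23:00:25",
    "2015-12-27 23:00:40",
    "2015-12-27 23:00:55",
    "2015-12-27 23:01:11",
    "2015-12-27 23:01:30",
])
ticks = pd.DataFrame(
    {
        "High": [2045.25, 2047.75, 2045.25, 2045.50, 2046.00],
        "Volume": [1, 7, 5, 2, 3],
    },
    index=index,
)

# One open/high/low/close bar per minute, built from the High column.
bars = ticks["High"].resample("1min").ohlc()
print(bars)

# Traded volume per bar, aggregated with a sum instead of OHLC.
volume = ticks["Volume"].resample("1min").sum()
print(volume)
```

Selecting a single column before calling `.ohlc()` gives flat `open`/`high`/`low`/`close` columns; calling it on the whole frame produces one such group per column, as in the output shown in the question.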
