简体   繁体   中英

RBbg - Reshaping Time Series Data

Is there a better way of reshaping dataframe data?

temp <- bdh(conn,c("AUDUSD Curncy","EURUSD Curncy"),"PX_LAST","20110101")

gives

head(temp)
         ticker       date PX_LAST
1 AUDUSD Curncy 2011-01-01      NA
2 AUDUSD Curncy 2011-01-02      NA
3 AUDUSD Curncy 2011-01-03  1.0205
4 AUDUSD Curncy 2011-01-04  1.0040
5 AUDUSD Curncy 2011-01-05  1.0014
6 AUDUSD Curncy 2011-01-06  0.9969

and

tail(temp)
            ticker       date PX_LAST
2127 EURUSD Curncy 2013-11-26  1.3557
2128 EURUSD Curncy 2013-11-27  1.3570
2129 EURUSD Curncy 2013-11-28  1.3596
2130 EURUSD Curncy 2013-11-29  1.3591
2131 EURUSD Curncy 2013-11-30      NA
2132 EURUSD Curncy 2013-12-01      NA

in other words, the data are just vertically tacked on to each other and further processing is necessary in order to get them working. how can i regroup this data into the various tickers, ie

head(temp)
           AUDUSD.Curncy EURUSD.Curncy
2011-01-01            NA            NA
2011-01-02            NA            NA
2011-01-03        1.0205        1.3375
2011-01-04        1.0040        1.3315
2011-01-05        1.0014        1.3183
2011-01-06        0.9969        1.3028

All the reshaping questions I googled didnt have the kind of reshaping I wanted. I have implemented my own piecemeal solution given below but for learning's sake I wanted to ask you guys if there is a more elegant solution for this?

You could try read.zoo . Use index.column to specify in which column index/time is stored, and reshape data according to split columnn, . The result is a zoo time series

library(zoo)

z <- read.zoo(text = "ticker     date PX_LAST
1 AUDUSD 2011-01-01      NA
2 AUDUSD  2011-01-02      NA
3 AUDUSD 2011-01-03  1.0205
4 AUDUSD 2011-01-04  1.0040
5 AUDUSD  2011-01-05  1.0014
6 AUDUSD 2011-01-06  0.9969
2127 EURUSD  2013-11-26  1.3557
2128 EURUSD  2013-11-27  1.3570
2129 EURUSD  2013-11-28  1.3596
2130 EURUSD  2013-11-29  1.3591
2131 EURUSD  2013-11-30      NA
2132 EURUSD  2013-12-01      NA", index.column = "date", split = "ticker")

z
#            AUDUSD EURUSD
# 2011-01-01     NA     NA
# 2011-01-02     NA     NA
# 2011-01-03 1.0205     NA
# 2011-01-04 1.0040     NA
# 2011-01-05 1.0014     NA
# 2011-01-06 0.9969     NA
# 2013-11-26     NA 1.3557
# 2013-11-27     NA 1.3570
# 2013-11-28     NA 1.3596
# 2013-11-29     NA 1.3591
# 2013-11-30     NA     NA
# 2013-12-01     NA     NA

str(z)

This is exactly why we have created the RbbgExtension package. It is a wrapper around the Rbbg package that handles many issues when dealing with financial data - issues we have come across in our daily work with backtesting trading strategies etc. for a financial institution.

As you can see the output is a xts object, but if the query is across multiple tickers and multiple fields, then the output will an array - but you can read about why that is in the documentation.

We have made the package open source and publicly available on GitHub . Just use Hadley's devtools' function install_github("pgarnry/RbbgExtension") to get the package. It has a few dependencies including "Rbbg".

> require(RbbgExtension)
Loading required package: RbbgExtension
> 
> tickers <- c("AUDUSD", "EURUSD")
> 
> prices <- HistData(tickers = tickers,
+                    type = "Curncy",
+                    fields = "PX_LAST",
+                    startdate = "20110101")
R version 3.1.2 (2014-10-31) 
rJava Version 0.9-6 
Rbbg Version 0.5.3 
Java environment initialized successfully.
Looking for most recent blpapi3.jar file...
Adding C:\blp\API\APIv3\JavaAPI\v3.7.1.1\lib\blpapi3.jar to Java classpath
Bloomberg API Version 3.7.1.1 
> class(prices)
[1] "xts" "zoo"
> head(prices)
           AUDUSD EURUSD
2011-01-03 1.0168 1.3361
2011-01-04 1.0051 1.3308
2011-01-05 0.9995 1.3149
2011-01-06 0.9944 1.3003
2011-01-07 0.9959 1.2907
2011-01-10 0.9956 1.2951
> tail(prices)
           AUDUSD EURUSD
2015-01-26 0.7925 1.1238
2015-01-27 0.7937 1.1381
2015-01-28 0.7889 1.1287
2015-01-29 0.7762 1.1320
2015-01-30 0.7762 1.1291
2015-02-02 0.7806 1.1351

rbbg's blh (now bdh) is dumb. this outputs time series correctly.

bdhx <- function(conn,securities,start_date,end_date=NULL,fields="PX_LAST",override_fields = NULL,overrides = NULL) {
  temp <- bdh(conn=conn,securities=securities,fields=fields,start_date=start_date,end_date=end_date,override_fields=override_fields)
  if (colnames(temp)[1]=="date")
    {temp <- as.xts(temp)[,-1];colnames(temp) <- securities;res <- temp;}
  else
    {cn <- unique(temp[,1]);fil <- temp[,1]==cn[1];
     res <- xts(temp[fil,3],as.Date(temp[fil,2]));colnames(res) <- securities[1];
         for (i in 4:(length(cn)+2)){
          fil <- temp[,1]==cn[i-2]
          temp2 <- xts(temp[fil,3],as.Date(temp[fil,2]));colnames(temp2) <- securities[i-2];
          res <- merge.xts(res,temp2)}
     }
  res}

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM