
Issue converting a Python pandas DataFrame to an R data frame for use with rpy2

I am having trouble converting a pandas DataFrame in Python to an R object, for future use in R using rpy2.

pandas 0.8.0 (released a few weeks ago) added a function to convert pandas DataFrames to R data frames. The problem is converting the first column of my pandas DataFrame, which consists of consecutive Python datetime objects forming a time series. The conversion to an R data frame returns a StrVector of the dates and times rather than a vector of R datetime objects, which I believe are called "POSIXct" objects.

I know how to convert a single string of the returned type to POSIXct in R: as.POSIXct('yyyy-mm-dd hh:mm:ss'). Unfortunately, I have not been able to figure out how to convert all the strings in the StrVector to POSIXct using Python and rpy2. The dates need to be in POSIXct format to be used with the TTR library in R. Below is the relevant Python code:

import pandas
from pandas import *
import pandas.rpy.common as com
import rpy2.robjects as robjects
r = robjects.r
r.library('TTR')        #library contains the function ADX, to be used later

dataframe = read_csv('file_name', parse_dates=[0], names=['Date', 'Col1', 'Col2', 'Col3'])     #command makes 1st column into datetime.datetime objects
r_dataframe = com.convert_to_r_dataframe(dataframe)

ADX = r['ADX']          #creating a name for an R function in python
adx = ADX(r_dataframe)    #will not work because the dates in r_dataframe are in a StrVector

Further I do not believe that the StrVector can be iterated through to convert each object to a POSIXct object individually, due to the definition of a StrVector. Maybe there is a way to cast a StrVector to a generic one?
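
For context on what the target type looks like: an R POSIXct vector is stored as numeric seconds since 1970-01-01 UTC plus a class attribute, and as.POSIXct accepts a whole character vector at once, so element-by-element conversion should not be needed on the R side. The sketch below only illustrates the Python-side equivalent of that representation. The helper name to_epoch_seconds, the timestamp format, and the choice to treat the timestamps as UTC are assumptions for illustration, not part of pandas or rpy2.

```python
from datetime import datetime, timezone


def to_epoch_seconds(stamps, fmt="%Y-%m-%d %H:%M:%S"):
    """Parse timestamp strings and return seconds since 1970-01-01 UTC.

    Hypothetical helper: assumes every string matches `fmt` and that the
    timestamps should be interpreted as UTC.
    """
    seconds = []
    for stamp in stamps:
        # strptime yields a naive datetime; attach UTC explicitly so the
        # result does not depend on the machine's local time zone.
        parsed = datetime.strptime(stamp, fmt).replace(tzinfo=timezone.utc)
        seconds.append(parsed.timestamp())
    return seconds


# Made-up sample values in the 'yyyy-mm-dd hh:mm:ss' form from the question.
stamps = ["1970-01-01 00:00:00", "1970-01-02 00:00:30"]
seconds = to_epoch_seconds(stamps)
```

Numbers like these are what a POSIXct vector holds underneath; if the data are in another time zone, the UTC assumption would need to change.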

Any help/insight into this matter is greatly appreciated. I am a novice programmer and have been working on this for a couple hours now to no avail.

Thank you!

Your ADX call fails because ADX expects an xts or matrix-like object with 3 columns: High, Low, Close. Your object contains 4 columns. Drop the date column before passing r_dataframe to ADX and everything should work. You can then add the datetime column back to the ADX output.

Alternatively, if you set the row.names attribute of your R data.frame to the values of the Date column and then remove the Date column, you can convert your R data.frame to an xts object by calling as.xts(r.data.frame). Then you can pass that to ADX and convert the result back to a pandas DataFrame.
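
Both suggestions can be prepared on the pandas side before anything is handed to R. The following is a minimal sketch using current pandas calls (drop and set_index), not the pandas 0.8.0 API from the question. The sample values are made up, and reading Col1, Col2, Col3 as High, Low, Close is an assumption.

```python
import pandas as pd

# Made-up sample data with the same column layout as the question.
df = pd.DataFrame({
    "Date": pd.to_datetime(["2012-07-02", "2012-07-03", "2012-07-04"]),
    "Col1": [10.5, 10.8, 11.0],   # assumed to be High
    "Col2": [10.0, 10.2, 10.6],   # assumed to be Low
    "Col3": [10.3, 10.7, 10.9],   # assumed to be Close
})

# Option 1: drop the date column so only the three numeric columns remain.
hlc = df.drop(columns=["Date"])

# Option 2: move the dates into the index, the pandas counterpart of
# putting them into row.names and removing the Date column.
indexed = df.set_index("Date")
```

With option 1, the dates stay available in df["Date"] and can be joined back to a row-aligned result afterwards. With option 2, the dates travel along as the index.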

dalejung on GitHub has recently done a lot of work on creating a tighter pandas-xts interface using rpy2. You could contact him or join the PyData mailing list.

This is not the answer you asked for, but how about using the piper library?

It is just a "pipe" between Python and R, so conversion problems rarely occur. https://pypi.python.org/pypi/piper
