
Reading an RDa file in Python as a pandas data frame

I have an RDa file that I created in R. I want to read this file in Python as a pandas data frame. I have the following code to do that:

import rpy2.robjects as robjects
import numpy as np
from rpy2.robjects import pandas2ri
pandas2ri.activate()

# load your file
robjects.r['load']('Data.RDa')

matrix = robjects.r['data']

matrix

I get the following result:

R object with classes: ('data.frame',) mapped to:
<DataFrame - Python:0x0CF46F58 / R:0x0ED0F200>
[Float..., Float..., Float..., ..., Float..., Float..., Float...]
  area: <class 'rpy2.robjects.vectors.FloatVector'>
  R object with classes: ('numeric',) mapped to:
<FloatVector - Python:0x0CF56A80 / R:0x0F281898>
[NA_real_, NA_real_, NA_real_, ..., NA_real_, NA_real_, NA_real_]
  i: <class 'rpy2.robjects.vectors.FloatVector'>
  R object with classes: ('numeric',) mapped to:
<FloatVector - Python:0x0CF68E68 / R:0x0F2B9520>
[NA_real_, NA_real_, NA_real_, ..., NA_real_, NA_real_, NA_real_]
  s: <class 'rpy2.robjects.vectors.FloatVector'>
  R object with classes: ('numeric',) mapped to:
<FloatVector - Python:0x0CF68940 / R:0x0F380008>
[NA_real_, NA_real_, NA_real_, ..., NA_real_, NA_real_, NA_real_]
  ...
  upslope_area: <class 'rpy2.robjects.vectors.FloatVector'>
  R object with classes: ('numeric',) mapped to:
<FloatVector - Python:0x0D03FDA0 / R:0x0FE87C90>
[NA_real_, NA_real_, NA_real_, ..., 292.256494, NA_real_, NA_real_]
  i: <class 'rpy2.robjects.vectors.FloatVector'>
  R object with classes: ('numeric',) mapped to:
<FloatVector - Python:0x0D03FC88 / R:0x0FEBF918>
[331347.500000, 331352.500000, 331357.500000, ..., 332187.500000, 332192.500000, 332197.500000]
  s: <class 'rpy2.robjects.vectors.FloatVector'>
  R object with classes: ('numeric',) mapped to:
<FloatVector - Python:0x0D03FE68 / R:0x0FEF75A0>
[4554812.500000, 4554812.500000, 4554812.500000, ..., 4553982.500000, 4553982.500000, 4553982.500000]

How do I convert this to a pandas data frame?

This looks like a missing call to the current conversion when the first R object bound to the symbol 'data' is retrieved from the search path (in short, when doing robjects.r["data"]). Open an issue on the rpy2 tracker if there isn't one already, or comment on the existing issue if it is still unresolved or was closed prematurely.

Explicitly applying conversion rules that are limited to a code block makes for an easy workaround, and may help you ensure good performance. The conversion mechanism is convenient, but often at the expense of performance, since a copy of the data frame is made each time a conversion happens, in either direction.

Here is what that would look like:

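import rpy2.robjects as robjects
from rpy2.robjects import default_converter
from rpy2.robjects import pandas2ri
from rpy2.robjects.conversion import localconverter

# load the file, as in the question; this restores the R object named 'data'
robjects.r['load']('Data.RDa')

# use the default conversion rules to which the pandas conversion
# is added, only within this block
with localconverter(default_converter + pandas2ri.converter) as cv:
    dataf = robjects.r["data"]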

This is in the doc: http://rpy2.readthedocs.io/en/version_2.8.x/robjects_convert.html#local-conversion-rules
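For reuse, the load-then-convert steps could be wrapped in a small helper. The following is only a sketch, assuming a reasonably recent rpy2 with a working R installation and pandas available. The function name load_rda_as_dataframe and its parameters are hypothetical, not part of rpy2. It relies on R's load() returning the names of the objects it restored, and on the lookup robjects.r[name] applying the conversion rules active inside the localconverter block.

```python
def load_rda_as_dataframe(path, name):
    """Load an .RDa/.RData file and return the R object `name`,
    converted with the pandas conversion rules.

    Hypothetical helper for illustration; not part of rpy2.
    """
    # Imported lazily so that defining this function does not require
    # R or rpy2 to be installed.
    import rpy2.robjects as robjects
    from rpy2.robjects import default_converter, pandas2ri
    from rpy2.robjects.conversion import localconverter

    # R's load() returns a character vector with the names of the
    # objects it restored into the R global environment.
    loaded = list(robjects.r["load"](path))
    if name not in loaded:
        raise KeyError(
            "%r not found in %r (objects loaded: %s)" % (name, path, loaded)
        )

    # The pandas conversion is active only inside this block, so the
    # data frame is copied once, here, and nowhere else implicitly.
    with localconverter(default_converter + pandas2ri.converter):
        return robjects.r[name]
```

With the file from the question, the call would be load_rda_as_dataframe('Data.RDa', 'data'). Keeping the conversion local, rather than calling pandas2ri.activate() globally, makes it explicit where the copy between R and pandas happens.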
