
Converting an RPy2 ListVector to a Python dictionary

The natural Python equivalent to a named list in R is a dict, but RPy2 gives you a ListVector object.

import rpy2.robjects as robjects

a = robjects.r('list(foo="barbat", fizz=123)')

At this point, a is a ListVector object.

<ListVector - Python:0x108f92a28 / R:0x7febcba86ff0>
[StrVector, FloatVector]
  foo: <class 'rpy2.robjects.vectors.StrVector'>
  <StrVector - Python:0x108f92638 / R:0x7febce0ae0d8>
[str]
  fizz: <class 'rpy2.robjects.vectors.FloatVector'>
  <FloatVector - Python:0x10ac38fc8 / R:0x7febce0ae108>
[123.000000]

What I'd like to have is something I can treat like a normal Python dictionary. My temporary hack-around is this:

def as_dict(vector):
    """Convert an RPy2 ListVector to a Python dict"""
    result = {}
    for i, name in enumerate(vector.names):
        if isinstance(vector[i], robjects.ListVector):
            result[name] = as_dict(vector[i])
        elif len(vector[i]) == 1:
            result[name] = vector[i][0]
        else:
            result[name] = vector[i]
    return result

as_dict(a)
{'foo': 'barbat', 'fizz': 123.0}

b = robjects.r('list(foo=list(bar=1, bat=c("one","two")), fizz=c(123,345))')
as_dict(b)
{'fizz': <FloatVector - Python:0x108f7e950 / R:0x7febcba86b90>
 [123.000000, 345.000000],
 'foo': {'bar': 1.0, 'bat': <StrVector - Python:0x108f7edd0 / R:0x7febcba86ea0>
  [str, str]}}

So, the question is... Is there a better way or something built into RPy2 that I should be using?

I don't think getting an R vector into a dictionary has to be that involved. How about this:

In [290]:

dict(zip(a.names, list(a)))
Out[290]:
{'fizz': <FloatVector - Python:0x08AD50A8 / R:0x10A67DE8>
[123.000000],
 'foo': <StrVector - Python:0x08AD5030 / R:0x10B72458>
['barbat']}
In [291]:

dict(zip(a.names, map(list,list(a))))
Out[291]:
{'fizz': [123.0], 'foo': ['barbat']}

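And of course, if you don't mind using pandas, it is even easier. The result will contain numpy.array values instead of lists, but that will be fine in most cases. Note that pandas.rpy has since been removed from pandas, so this only works with an older release: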

In [294]:

import pandas.rpy.common as com
com.convert_robj(a)
Out[294]:
{'fizz': [123.0], 'foo': array(['barbat'], dtype=object)}

Simple R list to Python dictionary:

>>> import rpy2.robjects as robjects
>>> a = robjects.r('list(foo="barbat", fizz=123)')
>>> d = { key : a.rx2(key)[0] for key in a.names }
>>> d
{'foo': 'barbat', 'fizz': 123.0}
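
Note that the comprehension above keeps only the first element of each vector, so a value such as fizz=c(123,345) would be cut down to 123.0. Below is a minimal sketch of a helper that unwraps length-one vectors and keeps longer ones as lists. The name `unwrap` is hypothetical, and plain Python lists stand in for rpy2 vectors so the snippet can be run without R; with rpy2 the comprehension would presumably read `{key: unwrap(a.rx2(key)) for key in a.names}`.

```python
def unwrap(vector):
    """Return the single element of a length-one sequence, else a list of all elements."""
    values = list(vector)
    return values[0] if len(values) == 1 else values

# Plain lists stand in for rpy2 vectors here (assumption: rpy2 vectors can be
# passed to list() the same way, as list(a) in the earlier answer suggests).
fake_r_list = {"foo": ["barbat"], "fizz": [123.0, 345.0]}

d = {key: unwrap(value) for key, value in fake_r_list.items()}
print(d)
```

The trade-off is that the value type then depends on the vector length, which is the same behaviour as the as_dict function in the question.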

Arbitrary R object to Python object using R RJSONIO JSON serialization/deserialization

On R server: install.packages("RJSONIO", dependencies = TRUE)

>>> import json
>>> import rpy2.robjects as robjects
>>> robjects.r("library(RJSONIO)")
<StrVector - Python:0x300b8c0 / R:0x3fbccb0>
[str, str, str, ..., str, str, str]
>>> rjson = robjects.r(' toJSON( list(foo="barbat", fizz=123, lst=list(33,"bb")) )  ')
>>> pyobj = json.loads( rjson[0] )
>>> pyobj
{u'lst': [33, u'bb'], u'foo': u'barbat', u'fizz': 123}
>>> pyobj['lst']
[33, u'bb']
>>> pyobj['lst'][0]
33
>>> pyobj['lst'][1]
u'bb'
>>> rjson = robjects.r(' toJSON( list(foo="barbat", fizz=123, lst=list( key1=33,key2="bb")) )  ')
>>> pyobj = json.loads( rjson[0] )
>>> pyobj
{u'lst': {u'key2': u'bb', u'key1': 33}, u'foo': u'barbat', u'fizz': 123}
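
The session above was recorded on Python 2, hence the u'...' prefixes. As a small sketch of the Python side only, the snippet below parses a hand-typed JSON string of the same shape on Python 3, where json.loads returns plain str objects. The JSON text is an assumption about what toJSON produces for that list, not output captured from R.

```python
import json

# Assumed shape of the toJSON(...) result for the nested example above;
# this string is typed by hand, not captured from an R session.
rjson_text = '{"foo": "barbat", "fizz": 123, "lst": {"key1": 33, "key2": "bb"}}'

pyobj = json.loads(rjson_text)

# Nested named lists arrive as nested dicts with ordinary str keys.
print(pyobj["lst"]["key1"])
print(type(pyobj["foo"]).__name__)
```

With rpy2 the string would come from rjson[0], as in the session above.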

I had the same problem with a deeply nested structure of different rpy2 vector types. I couldn't find a direct answer anywhere on Stack Overflow, so here's my solution. Building on CT Zhu's answer, I came up with the following code to convert the complete structure to Python types recursively.

from rpy2.robjects.vectors import DataFrame, FloatVector, IntVector, StrVector, ListVector
import numpy
from collections import OrderedDict

def recurList(data):
    rDictTypes = [ DataFrame,ListVector]
    rArrayTypes = [FloatVector,IntVector]
    rListTypes=[StrVector]
    if type(data) in rDictTypes:
        return OrderedDict(zip(data.names, [recurList(elt) for elt in data]))
    elif type(data) in rListTypes:
        return [recurList(elt) for elt in data]
    elif type(data) in rArrayTypes:
        return numpy.array(data)
    else:
        if hasattr(data, "rclass"): # An unsupported r class
            raise KeyError('Could not proceed, type {} is not defined'.format(type(data)))
        else:
            return data # We reached the end of recursion
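
To make the recursion pattern easier to follow without an R installation, here is a rough sketch of the same control flow (named container to ordered dict, sequence to list, anything else returned as a leaf). `FakeNamedList` and `to_python` are hypothetical names, and the stand-in class only imitates the two things the code above relies on, a `names` attribute and iteration. Real rpy2 vectors behave differently in other respects, so this is an illustration rather than a replacement for recurList.

```python
from collections import OrderedDict

class FakeNamedList:
    """Hypothetical stand-in for an R named list: has .names and is iterable."""
    def __init__(self, **items):
        self.names = list(items)
        self._values = list(items.values())

    def __iter__(self):
        return iter(self._values)

def to_python(data):
    """Recursively turn named containers into OrderedDicts and sequences into lists."""
    if isinstance(data, FakeNamedList):
        return OrderedDict((n, to_python(v)) for n, v in zip(data.names, data))
    if isinstance(data, (list, tuple)):
        return [to_python(v) for v in data]
    return data  # leaf value: end of recursion

nested = FakeNamedList(
    foo=FakeNamedList(bar=[1.0], bat=["one", "two"]),
    fizz=[123.0, 345.0],
)
result = to_python(nested)
print(result["foo"]["bat"])
```

Using isinstance here, rather than the exact type() match in recurList, means subclasses of the container type are also treated as containers.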

You can also do the following:

In

dict(a.items())

Out

{'foo': R object with classes: ('character',) mapped to:
 ['barbat'], 'fizz': R object with classes: ('numeric',) mapped to:
 [123.000000]}

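With pandas2ri, which ships with rpy2 rather than pandas, one could also do the following. This uses the rpy2 2.x name ri2py, which was renamed rpy2py in rpy2 3.x: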

import rpy2.robjects as robjects
a = robjects.r('list(foo="barbat", fizz=123)')

from rpy2.robjects import pandas2ri
print(pandas2ri.ri2py(a.names))
temp = pandas2ri.ri2py(a)
print(temp[0])
print(temp[1])

The following is my function for the conversion from an rpy2 ListVector to a python dict, capable of handling nested lists:

import rpy2.robjects as ro
from rpy2.robjects import pandas2ri

def r_list_to_py_dict(r_list):
    converted = {}
    for name in r_list.names:
        val = r_list.rx(name)[0]
        if isinstance(val, ro.vectors.DataFrame):
            converted[name] = pandas2ri.ri2py_dataframe(val)
        elif isinstance(val, ro.vectors.ListVector):
            converted[name] = r_list_to_py_dict(val)
        elif isinstance(val, ro.vectors.FloatVector) or isinstance(val, ro.vectors.StrVector):
            if len(val) == 1:
                converted[name] = val[0]
            else:
                converted[name] = list(val)
        else: # single value
            converted[name] = val
    return converted

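A simple function to convert nested R named lists into a nested Python dictionary:

import numpy as np
from rpy2.rinterface import NULL  # without this import, the NULL check below raises NameError

def rext(r):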
    """
    Returns a R named list as a Python dictionary
    """
    # In case `r` is not a named list
    try:
        # No more names, just return the value!
        if r.names == NULL:
            # If more than one value, return numpy array (or list)
            if len(list(r)) > 1:
                return np.array(r)
            # Just one value, return the value
            else:
                return list(r)[0]
        # Create dictionary to hold named list as key-value
        dic = {}
        for n in list(r.names):
            dic[n] = rext(r[r.names.index(n)])
        return dic
    # Uh-oh `r` is not a named list, just return `r` as is
    except Exception:
        return r
