
rpy2 R->Python array: no dimension translation

I'm currently trying to copy a complex nested tibble from R to Python using the rpy2 package. Since Python does not handle nested data very well, I split my data into two parts (metadata and several time series) and convert the time series data into a 3D array within R. So far so good, but R handles the dimensions of an array differently from Python. I was hoping that rpy2 would translate the dimensions by itself, but as my MWE shows, this is not the case:

import rpy2.robjects as ro
import numpy as np

from rpy2.robjects import numpy2ri
from rpy2.robjects import default_converter
from rpy2.robjects.conversion import localconverter

ro.r(
    """
        f <- function() {
            data1 <- c(
                1, 2,  3,  4,
                5, 6,  7,  8,
                9, 10, 11, 12
            )
            data2 <- c(
                10, 20,  30,  40,
                50, 60,  70,  80,
                90, 100, 110, 120
            )
            result <- array(
                c(data1, data2),
                dim = c(4, 3, 2)
            )
            print(result)
            print(dim(result))
            return(result)
        }
    """
)

r_f = ro.globalenv["f"]
v_np = r_f()

print(type(v_np))
print("###################################")

with localconverter(default_converter + numpy2ri.converter) as cv:
    np_data_measurment = ro.conversion.rpy2py(v_np)

print(np_data_measurment)
print(type(np_data_measurment))
print(np_data_measurment.shape)
print("###################################")

np_good = np.array(
    [
        [
            [1, 5, 9],
            [2, 6, 10],
            [3, 7, 11],
            [4, 8, 12]],
        [
            [10, 50, 90],
            [20, 60, 100],
            [30, 70, 110],
            [40, 80, 120]],
    ]
)

print(np_good)
print(type(np_good))
print(np_good.shape)

print("###################################")
print(np_data_measurment.reshape(2, 4, 3, order='F'))

results in this:

, , 1

     [,1] [,2] [,3]
[1,]    1    5    9
[2,]    2    6   10
[3,]    3    7   11
[4,]    4    8   12

, , 2

     [,1] [,2] [,3]
[1,]   10   50   90
[2,]   20   60  100
[3,]   30   70  110
[4,]   40   80  120

[1] 4 3 2
<class 'rpy2.robjects.vectors.FloatArray'>
###################################
[[[  1.  10.]
  [  5.  50.]
  [  9.  90.]]

 [[  2.  20.]
  [  6.  60.]
  [ 10. 100.]]

 [[  3.  30.]
  [  7.  70.]
  [ 11. 110.]]

 [[  4.  40.]
  [  8.  80.]
  [ 12. 120.]]]
<class 'numpy.ndarray'>
(4, 3, 2)
###################################
[[[  1   5   9]
  [  2   6  10]
  [  3   7  11]
  [  4   8  12]]

 [[ 10  50  90]
  [ 20  60 100]
  [ 30  70 110]
  [ 40  80 120]]]
<class 'numpy.ndarray'>
(2, 4, 3)
###################################
[[[  1.   9.  50.]
  [  3.  11.  70.]
  [  5.  10.  90.]
  [  7.  30. 110.]]

 [[  2.  10.  60.]
  [  4.  12.  80.]
  [  6.  20. 100.]
  [  8.  40. 120.]]]

Now I am looking for a way to translate my data from R to Python that keeps the dimensionality of the R array. As you can see, I also included an example of what the ordering should look like (np_good) and tried to reshape the bad one (but I would prefer an rpy2 way of reshaping the data).

Do you have any idea, maybe a custom converter, as to how one can copy 3D arrays from R to Python while keeping the dimensions intact?

What this boils down to, IMO, is how R and (C-based) numpy arrays are laid out in memory: R is column-major (the first index varies fastest), while numpy defaults to row-major (the last index varies fastest).
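To make the layout difference concrete, here is a minimal numpy-only sketch. It assumes the 24 values from the question in the order R stores them, and the names `flat`, `r_like` and `c_like` are made up for illustration. Filling a (4, 3, 2) array in Fortran order reproduces the indexing of the R array, which is also what the converted array in the question's output looks like:

```python
import numpy as np

# The 24 values in the order R stores them, i.e. c(data1, data2) from the question.
flat = np.array(
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
     10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120],
    dtype=float,
)

# Column-major (Fortran) fill: the first index varies fastest, as in R.
r_like = flat.reshape((4, 3, 2), order="F")

# Row-major (C) fill: the last index varies fastest, numpy's default.
c_like = flat.reshape((4, 3, 2), order="C")

# r_like[i, j, k] corresponds to R's result[i + 1, j + 1, k + 1].
print(r_like[:, :, 0])  # the 4x3 block R prints under ", , 1"
print(r_like[1, 2, 1])  # R: result[2, 3, 2]
print(c_like[0, 0, :])  # the C-order fill puts different values at the same index
```

So the indices themselves agree between R and the converted array; what differs is which axis numpy treats as the outermost one when printing.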

A simple solution is to transpose your numpy array:

np_data_measurment.transpose((2,1,0))

This lists the values in the same order in which R stores them, one block per R slice (each block is the 3×4 transpose of the 4×3 slice that R prints):

array([[[  1.,   2.,   3.,   4.],
        [  5.,   6.,   7.,   8.],
        [  9.,  10.,  11.,  12.]],

       [[ 10.,  20.,  30.,  40.],
        [ 50.,  60.,  70.,  80.],
        [ 90., 100., 110., 120.]]])
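Note that `transpose((2, 1, 0))` yields shape (2, 3, 4), whereas `np_good` in the question has shape (2, 4, 3). If the goal is exactly the `np_good` layout, one option is to move only the last axis to the front. The sketch below rebuilds a stand-in for the converted array with plain numpy so that it runs without R; `converted` and `slices_first` are illustrative names, not part of rpy2:

```python
import numpy as np

# Stand-in for the array rpy2 returned in the question (shape (4, 3, 2)),
# rebuilt here from the same values so the example runs without R.
data1 = np.arange(1, 13, dtype=float)  # 1 .. 12
data2 = data1 * 10                     # 10 .. 120
converted = np.concatenate([data1, data2]).reshape((4, 3, 2), order="F")

# Move R's third dimension to the front and keep rows and columns as they are.
slices_first = np.moveaxis(converted, -1, 0)

# The same result written as an explicit axis permutation.
same = converted.transpose((2, 0, 1))

print(slices_first.shape)  # (2, 4, 3)
print(slices_first[0])     # the 4x3 block R prints under ", , 1"
```

Both calls return views onto the same data rather than copies, so the cost does not grow with the size of the time series.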

As long as you are not passing this transposed array back into R, you will be fine. (If you are, you need to transpose it back first.)
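Transposing back can be sketched as follows, again with plain numpy and placeholder data instead of a real R object. The inverse of an axis permutation is obtained here with `np.argsort`; for `(2, 1, 0)` the permutation happens to be its own inverse, but the same code also covers `(2, 0, 1)`:

```python
import numpy as np

# Placeholder data standing in for an array that came from R, shape (4, 3, 2).
converted = np.arange(24, dtype=float).reshape((4, 3, 2), order="F")

axes = (2, 1, 0)
transposed = converted.transpose(axes)

# argsort of a permutation gives the permutation that undoes it.
inverse_axes = tuple(int(i) for i in np.argsort(axes))
restored = transposed.transpose(inverse_axes)

print(restored.shape)                       # (4, 3, 2)
print(np.array_equal(restored, converted))  # True
```

After this step the array has R's original axis order again and can go through the rpy2 conversion in the other direction.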
