
Save 2d numpy array to R file format using rpy2

This is a beginner's question, but how do you save a 2d numpy array to a file in (compressed) R format using rpy2? To be clear, I want to save it with rpy2 and later read it in using R. I would like to avoid csv because the amount of data will be large.

Looks like you want R's save command. I would use the pandas R interface and do something like the following.

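import numpy as np
from rpy2.robjects import r
# Note: pandas.rpy was removed from later pandas releases;
# rpy2's own pandas2ri module is its replacement.
import pandas.rpy.common as com
from pandas import DataFrame

a = np.array([range(5), range(5)])
df = DataFrame(a)
# Convert the pandas DataFrame to an R data.frame
df = com.convert_to_r_dataframe(df)
# Bind it to the name "foo" in R, then save it compressed
r.assign("foo", df)
r("save(foo, file='here.gzip', compress=TRUE)")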

There may be a more elegant way, though; I'm open to better suggestions. In R, the saved file would then be used like this:

> load("here.gzip")
> foo
  X0 X1 X2 X3 X4
0  0  1  2  3  4
1  0  1  2  3  4

You can bypass pandas and use numpy2ri from rpy2, with something like:

import numpy as np
from rpy2.robjects import r
from rpy2.robjects.numpy2ri import numpy2ri

a = np.array([[i*2147483647**2 for i in range(5)], range(5)], dtype="uint64")
# convert to double precision numeric since R doesn't have unsigned ints
a = np.array(a, dtype="float64")
ro = numpy2ri(a)
# Bind the converted matrix to the name "bar" in R, then save it compressed
r.assign("bar", ro)
r("save(bar, file='another.gzip', compress=TRUE)")

Then, in R:

> load("another.gzip")
> bar
     [,1]         [,2]         [,3]         [,4]         [,5]
[1,]    0 4.611686e+18 9.223372e+18 1.383506e+19 1.844674e+19
[2,]    0 1.000000e+00 2.000000e+00 3.000000e+00 4.000000e+00
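
One caveat about the float64 conversion above: a double has a 53-bit significand, so not every integer above 2**53 is exactly representable, and the values in bar may differ slightly from the original uint64 entries. A minimal pure-Python sketch of the effect (Python's float is the same IEEE 754 double that R uses for numeric; the helper name survives_double is made up for illustration):

```python
def survives_double(n: int) -> bool:
    """Return True if integer n is unchanged by a round trip through a double."""
    return int(float(n)) == n

big = 2147483647 ** 2  # the multiplier used in the array above

print(survives_double(2 ** 53))      # True: still exactly representable
print(survives_double(2 ** 53 + 1))  # False: rounds to a neighbouring double
print(survives_double(big))          # False: the stored value is off by one
print(big - int(float(big)))         # 1
```

If exact 64-bit integers matter on the R side, a double matrix is not enough; for values below 2**53 the round trip is exact.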

Suppose you have a dataframe called data. The following code let me store its values as a matrix from Python and then load that matrix into R (RStudio).

Save the data to R:

# Take only the values of the dataframe
B=data.values

import rpy2.robjects as ro
import rpy2.robjects.numpy2ri
rpy2.robjects.numpy2ri.activate()

nr,nc = B.shape
Br = ro.r.matrix(B, nrow=nr, ncol=nc)

ro.r.assign("B", Br)
ro.r("save(B, file='here.Rdata')")

Then go to R and run:

load("D:/.../here.Rdata")

This did the job for me!

Here's an example without pandas that adds column and row names:

import warnings

import numpy as np
from rpy2.robjects import rinterface, r, IntVector, FloatVector, StrVector

# older (<2.1) versions of rpy2 have globalEnv instead of globalenv
# let's fix it a little
if not hasattr(rinterface, 'globalenv'):
    warnings.warn('Old version of rpy2 detected')
    rinterface.globalenv = rinterface.globalEnv

var_name = 'r_var'
vals = np.arange(20,dtype='float').reshape(4,5)

# transpose because R is column major vs python is row major 
r_vals = FloatVector(vals.T.ravel())
# make it  a matrix
rinterface.globalenv[var_name]=r['matrix'](r_vals,nrow=vals.shape[0])
# give it some row and column names
r("rownames(%s) <- c%s"%(var_name,tuple('ABCDEF'[i] for i in range(vals.shape[0]))))
r("colnames(%s) <- c%s"%(var_name,tuple(range(vals.shape[1]))))

#save it to file
r.save(var_name,file='r_from_py.rdata')
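
The transpose in the snippet above matters because R's matrix() fills column by column by default, while a NumPy array is flattened row by row. A small sketch of the reordering in plain Python, without NumPy or rpy2 (column_major is a hypothetical helper, not part of either library):

```python
def column_major(rows):
    """Flatten a row-major nested list in the column-by-column order R's matrix() fills."""
    n_cols = len(rows[0])
    return [row[j] for j in range(n_cols) for row in rows]

vals = [[0, 1, 2],
        [3, 4, 5]]

flat = column_major(vals)
print(flat)  # [0, 3, 1, 4, 2, 5]
```

Passing this sequence to matrix(..., nrow=2) in R reproduces the original 2 x 3 layout, and vals.T.ravel() performs the same reordering for a NumPy array. R's matrix() also accepts byrow=TRUE, which would let you pass the row-major order directly.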

An alternative to rpy2 is to write a mat-file from Python and load it in R.

In Python:

import os
import numpy as np
import scipy.io

os.chdir("/home/user/proj")  # specify a path to save to
x = np.linspace(0, 2 * np.pi, 100)
y = np.cos(x)
scipy.io.savemat('test.mat', dict(x=x, y=y))

Example copied from: "Converting" Numpy arrays to Matlab and vice versa

In R:

library(R.matlab)
object_list = readMat("/home/user/proj/test.mat")

I'm a beginner in Python.
