Reading netCDF files stored on a remote filesystem in R?
I need to read a netCDF file into R that is stored on a remote filesystem. I have ssh access to the filesystem, but the files are too big to store on my local computer.
I have tried the advice from here: Can R read from a file through an ssh connection? I tried the following:
library(ncdf)
d = open.ncdf(pipe('ssh hostname "path/to/file/foo.nc"'))
However, I keep getting the error:
bash: path/to/file/foo.nc: Permission denied
Any ideas on how to fix this?
It is not possible to open the file directly from within R using ssh, but there are a few options available to you.
There are packages which will let you mount remote machines as local filesystems over ssh; on Linux, for example, you might use sshfs, whereas on Windows you might use win-sshfs. Once you've mounted the remote file system, you can access the netCDF files from R just as you would any other file, although I'm not sure what the performance implications may be.
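As a rough sketch of the mounted-filesystem option, the R code below opens a file through the mount point and reads only a slice of one variable, so that only part of the file has to cross the network. It assumes the remote filesystem is already mounted (for example with `sshfs hostname:/ ~/remote`) and uses the ncdf4 package rather than the older ncdf; the mount point `~/remote`, the variable name `temp`, and the helper name `read_mounted_slice` are placeholders made up for illustration.

```r
# Sketch only: assumes the remote filesystem is already mounted locally
# and that the ncdf4 package is installed.
read_mounted_slice <- function(nc_path, var_name, start, count) {
  if (!file.exists(nc_path)) {
    stop("File not found (is the remote filesystem mounted?): ", nc_path)
  }
  nc <- ncdf4::nc_open(nc_path)
  on.exit(ncdf4::nc_close(nc))
  # start/count select a hyperslab, so only that part of the variable is read
  ncdf4::ncvar_get(nc, var_name, start = start, count = count)
}

# Hypothetical usage: a 10 x 10 block from the first step of a 3-D variable "temp"
# x <- read_mounted_slice("~/remote/path/to/file/foo.nc", "temp",
#                         start = c(1, 1, 1), count = c(10, 10, 1))
```

Reading with `start` and `count` is probably the main thing that keeps this usable over a slow link, since reading a whole variable would pull all of its data through the mount.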
Use the command-line ncdump utility, on the server, to create smaller files from the large files, small enough to fit on your local file system.
$ ncdump -v [var1],[var2] big.nc > smaller.cdl
smaller.cdl will be a text file; you can generate a binary netCDF .nc file by using ncgen:
$ ncgen -b -o smaller.nc smaller.cdl
Unless your remote server is already set up to provide OPeNDAP service, this is probably overkill. But if it is, you may use a combination of R's OPeNDAP access and netCDF's OPeNDAP subset service to retrieve data subsets on the fly. You can also use ncdump on your local machine to request a subset of data from the server.
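A minimal sketch of the OPeNDAP route follows, under several assumptions: the server URL is invented, the variable `temp` is hypothetical, and the helper `dap_constraint` is not part of any package. It builds a DAP2 constraint expression of the form `var[first:last]`, which can be appended to the dataset URL to request a subset. Opening such URLs from R only works if the netCDF library behind ncdf4 was built with DAP support, so that part is left commented out.

```r
# Build a DAP2 constraint such as "temp[0:0][0:9][0:9]".
# Indices are zero-based and inclusive, in the server's dimension order.
dap_constraint <- function(var_name, first, last) {
  stopifnot(length(first) == length(last), all(first <= last))
  ranges <- paste0("[", as.integer(first), ":", as.integer(last), "]",
                   collapse = "")
  paste0(var_name, ranges)
}

# Hypothetical server URL; replace with your own OPeNDAP endpoint.
base_url <- "http://example.org/opendap/foo.nc"
subset_url <- paste0(base_url, "?",
                     dap_constraint("temp", c(0, 0, 0), c(0, 9, 9)))

# Reading through ncdf4 needs a live server, so it is commented out.
# ncdf4 indices are 1-based, with dimensions in the reverse of the CDL order.
# nc <- ncdf4::nc_open(base_url)
# x  <- ncdf4::ncvar_get(nc, "temp", start = c(1, 1, 1), count = c(10, 10, 1))
# ncdf4::nc_close(nc)
```

With a DAP-enabled ncdump on the local machine, the same `subset_url` (quoted, so the shell leaves the brackets alone) could be passed to ncdump in place of a file name.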
I'd try to arrange a Samba or NFS share. After that you can simply approach the file as you would any other.
It is not possible to do it via ssh. The pipe command executes a shell command. You are trying to execute path/to/file/foo.nc, which fails because it is not an executable. The examples you gave read a command's text output through the pipe, which is then parsed by R. This is not the same.
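To illustrate the difference, here is a small sketch, assuming a Unix-like shell. pipe() runs a command and hands R that command's standard output as a connection, which works when the command prints text that a function such as read.csv can parse. The local printf below is only a stand-in for a remote command such as `ssh hostname "cat file.csv"` (a hypothetical file).

```r
# pipe() runs a shell command and exposes its standard output as a connection.
# Here a local printf stands in for a remote `ssh hostname "cat file.csv"`.
con <- pipe("printf 'a,b\\n1,2\\n'")
df <- read.csv(con)  # read.csv opens the connection, parses the text, closes it
print(df)

# By contrast, the command in the question is just a file path, so the remote
# shell tries to execute the file itself and reports "Permission denied":
# pipe('ssh hostname "path/to/file/foo.nc"')
```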
The closest you could get is to use ncdump on the remote machine, which can convert variables from the files into a text version that you may be able to parse.
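A sketch of that idea, assuming ncdump is installed on the server and that `hostname` and the path are the placeholders from the question: the helper below (`remote_ncdump_cmd`, a name made up here) builds an ssh command that runs ncdump remotely, so that R receives the CDL text through a pipe. With no variables given it asks for the header only (`-h`); otherwise it uses `-v` as in the ncdump example above.

```r
# Build an ssh command that runs ncdump on the remote machine.
# -h prints only the header; -v prints the header plus data for the named variables.
remote_ncdump_cmd <- function(host, nc_path, vars = NULL) {
  flag <- if (is.null(vars)) "-h" else paste("-v", paste(vars, collapse = ","))
  remote <- paste("ncdump", flag, shQuote(nc_path, type = "sh"))
  paste("ssh", host, shQuote(remote, type = "sh"))
}

cmd <- remote_ncdump_cmd("hostname", "path/to/file/foo.nc")

# Needs ssh access and ncdump on the server, so it is left commented out:
# header <- readLines(pipe(cmd))
```

The result is a character vector of CDL lines, which would still have to be parsed by hand; for large variables the text form can be considerably bigger than the binary data.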