Loading .RData file into Data Science Experience

I am trying to load a .RData file into my R notebook in DSX. I have followed the instructions in this notebook ( https://apsportal.ibm.com/exchange/public/entry/view/90a34943032a7fde0ced0530d976ca82 ) but am still unable to load my data. So far, I have completed the following steps successfully:

  1. I loaded my dataset into object storage.
  2. I inserted my credentials using the Insert to code -> Insert Credentials button. This seemed to work as expected.
  3. In the next cell, I chose the Insert to code -> Insert textConnection object option. This also seemed to work as expected.
  4. The output of step 3 was as follows:

Your data file was loaded into a textConnection object and you can process the data with your package of choice.

    data.1 <- getObjectStorageFileWithCredentials_xxxxxxxxxx("projectname", "file.RData")

  5. After this, since my file is a .RData file, I typed the following command:

    data <- load("file.RDA")

When I ran this cell, I got the following output:

    Warning message in readChar(con, 5L, useBytes = TRUE):
    "cannot open compressed file 'file.RDA', probable reason 'No such file or directory'"
    Error in readChar(con, 5L, useBytes = TRUE): cannot open the connection
    Traceback:

    1. load("file.RDA")
    2. readChar(con, 5L, useBytes = TRUE)

  6. When I type the following command to print the dataset:

    data

I get the following output:

    X.html..h1.Forbidden..h1..p.Access.was.denied.to.this.resource...p...html.

Could someone please help?

Thanks, Venky

Here is a workaround. load cannot read from a response object, and the only way to read objects from Object Storage is the REST API.

I tried to use rawConnection instead of textConnection, but it did not seem to help.

So instead of passing the object read from Object Storage directly to the load or readRDS function, you can write it to the GPFS of the attached Spark service and read it from there, just as you would read a local file.

Change these lines in the generated code:

    # Return the response body as raw bytes instead of text
    rawdata <- httr::content(httr::GET(url = access_url, httr::add_headers("Content-Type" = "application/json", "X-Auth-Token" = x_subject_token)), as = "raw")
    rawdata

Basically, instead of returning text, return a raw object and then write it as a binary object to the local GPFS.
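Before writing the bytes to disk, it can help to confirm that the download actually returned the file and not an HTML error page such as the "Forbidden" response shown in the question. The following is only a minimal sketch: `looks_like_gzip` is a hypothetical helper, not part of the generated DSX code, and it assumes the .RData file was written by `save()` with its default gzip compression. The download is simulated with local files so the example runs without Object Storage.

```r
# Hypothetical helper (not part of the generated DSX code): checks whether a
# raw vector starts with the gzip magic bytes 0x1f 0x8b.
looks_like_gzip <- function(bytes) {
  is.raw(bytes) && length(bytes) >= 2 &&
    identical(bytes[1:2], as.raw(c(0x1f, 0x8b)))
}

# Stand-in for a successful download: save an object locally and read the
# file back as raw bytes.
tmp <- tempfile(fileext = ".RData")
df <- data.frame(x = 1:3)
save(df, file = tmp)
good_bytes <- readBin(tmp, what = "raw", n = file.size(tmp))

# Stand-in for a failed download: an HTML error page instead of the file.
bad_bytes <- charToRaw("<html><h1>Forbidden</h1></html>")

print(looks_like_gzip(good_bytes))
print(looks_like_gzip(bad_bytes))
```

If the check fails on the real download, the credentials or the container and file names passed to the generated function are the first things to look at.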

    data.3 <- getObjectStorageFileWithCredentials_216c032f3f574763ae975c6a83a0d523("testObjectStorage", "sample.rdata")

    writeBin(data.3, "sample.rdata")

Now read it back using load (or readRDS, if the file was written with saveRDS).

    load("sample.rdata")

To see the loaded objects, run:

    ls()
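The write-then-load round trip can be tried without Object Storage. This is a rough sketch under stated assumptions: the raw bytes that the generated download function would return are simulated by reading a locally saved file with readBin, and the names `sample_df`, `sample_copy.rdata` and `env` are made up for illustration.

```r
# Stand-in for the file in object storage: save a data frame locally.
src <- tempfile(fileext = ".rdata")
sample_df <- data.frame(id = 1:3, label = c("a", "b", "c"))
save(sample_df, file = src)

# Stand-in for the raw bytes returned by the download function.
bytes <- readBin(src, what = "raw", n = file.size(src))

# Write the bytes to a local file, as in the steps above.
dest <- file.path(tempdir(), "sample_copy.rdata")
writeBin(bytes, dest)

# load() restores the saved objects into an environment and returns
# their names as a character vector.
env <- new.env()
loaded_names <- load(dest, envir = env)
print(loaded_names)
print(get(loaded_names[1], envir = env))
```

Note that load returns the names of the restored objects, not the data itself, so an assignment like `data <- load(...)` stores only those names. The objects are restored under the names they were saved with.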

I hope it helps.

Thanks, Charles.
