Download a custom dataset in Azure ML Jupyter/iPython Notebook using R

I need to download a custom dataset in an Azure ML Jupyter/iPython Notebook. My ultimate goal is to install an R package. To do this, the package (uploaded as a dataset) needs to be downloaded in code. I followed the steps outlined by Andrie de Vries in the comments section of this post: Jupyter Notebooks with R in Azure ML Studio.

Uploading the package as a ZIP file worked without problems, but when I run the code in my notebook I get an error:

Error in curl(x$DownloadLocation, handle = h, open = conn): Failure when receiving data from the peer
Traceback:

  1. download.datasets(ws, "plotly_3.6.0.tar.gz.zip")
  2. lapply(1:nrow(datasets), function(j) get_dataset(datasets[j, ], ...))
  3. FUN(1L[[1L]], ...)
  4. get_dataset(datasets[j, ], ...)
  5. curl(x$DownloadLocation, handle = h, open = conn)

So I simplified my code to:

library("AzureML")
ws <- workspace()
ds <- datasets(ws)
ds$Name

data <- download.datasets(ws, "plotly_3.6.0.tar.gz.zip")
head(data)

Here "plotly_3.6.0.tar.gz.zip" is the name of my dataset, which has data type "Zip". Unfortunately this results in the same error. To rule out data type issues I also tried to download another dataset of mine, which has data type "Dataset". That gave the same error as well.

Next I changed the dataset I want to download to one of the sample datasets of AzureML Studio. "text.preprocessing.zip" is of data type Zip:

data <- download.datasets(ws, "text.preprocessing.zip")

"Flight Delays Data" is of data type GenericCSV:

data <- download.datasets(ws, "Flight Delays Data")

Both of the sample datasets can be downloaded without problems.
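The comparison above can be scripted so that each dataset is tried in turn and the error message is captured instead of aborting the notebook cell. This is only a minimal sketch and not part of the AzureML package: `try_downloads` and its `download_fun` argument are hypothetical names, and the real call is assumed to be wrapped as `function(n) download.datasets(ws, n)`.

```r
# Try each dataset name in turn and record whether the download succeeds.
# `download_fun` is a hypothetical one-argument wrapper around the real call,
# e.g. function(n) download.datasets(ws, n).
try_downloads <- function(dataset_names, download_fun) {
  outcomes <- lapply(dataset_names, function(n) {
    tryCatch({
      download_fun(n)
      data.frame(name = n, ok = TRUE, message = "",
                 stringsAsFactors = FALSE)
    }, error = function(e) {
      # Keep the error text so failing datasets can be compared side by side
      data.frame(name = n, ok = FALSE, message = conditionMessage(e),
                 stringsAsFactors = FALSE)
    })
  })
  do.call(rbind, outcomes)
}

# Possible usage inside the notebook (requires a configured workspace):
# library("AzureML")
# ws <- workspace()
# try_downloads(
#   c("plotly_3.6.0.tar.gz.zip", "text.preprocessing.zip", "Flight Delays Data"),
#   function(n) download.datasets(ws, n)
# )
```

The result is a data frame with one row per dataset, which makes it easy to see whether only the custom datasets fail.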

So why can't I download my own saved dataset?

I could not find anything helpful in the documentation of the download.datasets function, neither on rdocumentation.org nor on cran.r-project.org (pages 17-18).

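Try this:

library(AzureML)

# Pass the workspace credentials explicitly instead of relying on the defaults
ws <- workspace(
  id = "your AzureML ID",
  auth = "your AzureML Key"
)

# Download the saved dataset by name
name <- "Name of your saved data"
data <- download.datasets(ws, name)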

It seems the error I got was due to a bug in the (then still early) Azure ML Studio.

I tried again after Daniel Prager's reply, only to find that my code works as expected without any changes. Adding the id and auth parameters was not needed.
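With the download working, the remaining step toward the original goal is installing the package from the uploaded archive. The following is a rough sketch under stated assumptions, not the method from the linked post: it assumes the ZIP has already been saved to local disk (how to get from the result of download.datasets to a file is not covered here) and that it wraps a single source tarball such as plotly_3.6.0.tar.gz. The function names `find_source_tarballs` and `install_from_zipped_tarball` are hypothetical.

```r
# Locate source tarballs (*.tar.gz) inside a directory of extracted files.
find_source_tarballs <- function(dir) {
  list.files(dir, pattern = "\\.tar\\.gz$", full.names = TRUE,
             recursive = TRUE)
}

# Unpack a ZIP that wraps an R source package and install the package.
# ASSUMPTION: `zip_path` points to a ZIP file already present on local disk.
install_from_zipped_tarball <- function(zip_path,
                                        exdir = tempfile("pkg_")) {
  # unzip() creates `exdir` if it does not exist yet
  utils::unzip(zip_path, exdir = exdir)
  tarballs <- find_source_tarballs(exdir)
  if (length(tarballs) != 1) {
    stop("Expected exactly one .tar.gz in the archive, found ",
         length(tarballs))
  }
  # repos = NULL tells install.packages() to install from the local file
  utils::install.packages(tarballs, repos = NULL, type = "source")
  invisible(tarballs)
}

# Possible usage (the path is a placeholder):
# install_from_zipped_tarball("plotly_3.6.0.tar.gz.zip")
# library(plotly)
```

Installing from source this way does not resolve dependencies, so any packages the tarball depends on would need to be installed first.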
